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Abstract 

As  researchers  strive  to  achieve  autonomy  in  systems,  many  believe  the  goal  is 
not  that  machines  should  attain  full  autonomy,  but  rather  to  obtain  the  right  level  of 
autonomy  for  an  appropriate  man-machine  interaction.  A  common  phrase  for  this  in¬ 
teraction  is  manned- unmanned  teaming  (MUM-T),  a  subset  of  which,  for  unmanned 
aerial  vehicles,  is  the  concept  of  the  loyal  wingman.  This  work  demonstrates  the 
use  of  optimal  control  and  stochastic  estimation  techniques  as  an  autonomous  near 
real-time  dynamic  route  planner  for  the  DoD  concept  of  the  loyal  wingman.  First, 
the  optimal  control  problem  is  formulated  for  a  static  threat  environment  and  a  hy¬ 
brid  numerical  method  is  demonstrated.  The  optimal  control  problem  is  transcribed 
to  a  nonlinear  program  using  direct  orthogonal  collocation,  and  a  heuristic  particle 
swarm  optimization  algorithm  is  used  to  supply  an  initial  guess  to  the  gradient-based 
nonlinear  programming  solver.  Next,  a  dynamic  and  measurement  update  model  and 
Kalman  filter  estimating  tool  is  used  to  solve  the  loyal  wingman  optimal  control  prob¬ 
lem  in  the  presence  of  moving,  stochastic  threats.  Finally,  an  algorithm  is  written  to 
determin  if  and  when  the  loyal  wingman  should  dynamically  re-plan  the  trajectory 
based  on  a  critical  distance  metric  which  uses  speed  and  stochastics  of  the  moving 
threat  as  well  as  relative  distance  and  angle  of  approach  of  the  loyal  wingman  to  the 
threat.  These  techniques  are  demonstrated  through  simulation  for  computing  the 
global  outer-loop  optimal  path  for  a  minimum  time  rendezvous  with  a  manned  lead 
while  avoiding  static  as  well  as  moving,  non-deterministic  threats,  then  updating  the 
global  outer-loop  optimal  path  based  on  changes  in  the  threat  mission  environment. 
Results  demonstrate  a  methodology  for  rapidly  computing  an  optimal  solution  to  the 
loyal  wingman  optimal  control  problem. 
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OPTIMAL  CONTROL  OF  AN  UNINHABITED  LOYAL  WINGMAN 


I.  Introduction 


sensors  detect  abnormal  movements  of  vehicles  from  a  key  Weapons 
of  Mass  Destruction  (WMD)  storage  site  [in  an  adversarial  nation].  The 
U.N.  authorizes  interception  of  the  WMD  because  proliferation  and  po¬ 
tential  terrorist  use  of  the  WMD  are  greater  risks  than  a  likely  response 
from  [the  adversary  nation].  Penetrating,  high-altitude  airborne  systems 
track  the  vehicle  and  provide  cueing  information  to  incoming  strike  air¬ 
craft.  Launched  from  the  off-shore  aircraft  carrier,  the  strike  package 
comprises  of  manned  tactical  aircraft  with  numerous  combat  support  Un¬ 
manned  Aerial  Systems  (UAS)  providing  tactical  intelligence  communica¬ 
tion  relay,  jamming  support,  and  strike  support.  The  joint  strike  fighter 
operates  as  a  command  ship  and  works  in  concert  with  its  supporting 
unmanned  systems  as  a  seamless  network  of  strike  and  jamming  aircraft. 

The  strike  package  penetrates  [the  adversarial  nation’s]  airspace  and  in¬ 
tercepts,  strikes,  and  stops  the  convoy...”  [1] 

The  vignette  above  is  quoted  from  the  FY2013  Unmanned  Systems  Integrated 
Roadmap  [1]  and  illustrates  a  future  combat  environment  in  which  a  single  manned 
aircraft  operates  as  a  command  ship  working  in  concert  with  unmanned  systems  to 
complete  a  mission  that  is  vital  to  the  security  interests  of  the  U.S.  U.S.  “deployed 
forces  have  seen  how  effective  unmanned  systems  can  be  in  combat  operations...” 
which  “has  created  expectations  for  expanding  the  roles  for  unmanned  systems  in 
future  combat  scenarios  [2].”  The  earlier  published  FY2011  Unmanned  Systems  In¬ 
tegrated  Roadmap  [2]  highlighted  seven  challenges  facing  all  military  service  depart¬ 
ments,  one  of  which  is  Manned-Unmanned  Teaming  (MUM-T),  stating  “DoD  must 
continue  to  implement  technologies  and  evolve  Tactics,  Techniques  and  Procedures 
(TTP)  that  improve  the  teaming  of  unmanned  systems  with  the  manned  force”  [2], 
MUM-T  is  a  concept  which  describes  manned  and  unmanned  systems  working  to- 
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gether  to  achieve  a  common  operational  mission  objective,  and  includes  the  concept 
of  the  ‘  loyal  wingman'.  Much  like  manned  aircraft  of  modern  times  that  fly  in  for¬ 
mation  under  the  tactical  command  of  their  lead,  the  loyal  wingman  is  a  UAV  that 
flies  under  tactical  command  of  a  manned  lead  aircraft  both  while  in  and  out  of  for¬ 
mation.  A  review  of  DoD  documents  in  Section  1.1  reveals  a  loyal  wingman  may  be 
involved  in  numerous  mission  applications,  command  and  control  architectures  and 
potential  scenarios.  Therefore,  this  chapter  establishes  a  definition  as  well  as  a  com¬ 
mand  and  control  structure  and  candidate  scenario  for  the  loyal  wingman  that  drives 
the  research  performed  in  this  work.  This  chapter  additionally  establishes  research 
questions  that,  when  answered  throughout  this  work,  provide  a  contribution  to  the 
existing  body  of  knowledge. 

1.1  Requirement 

Multiple  DoD  requirements  documents  are  examined  to  ensure  the  defined  com¬ 
mand  and  control  architecture  and  candidate  scenario  are  applicable  to  real-world 
user  needs. 

1.1.1  UAS  Roadmap. 

The  2011  Roadmap  [2]  lists  capabilities  that  may  be  met  by  a  loyal  wingman  such 
as  defeating  ground  explosives  from  standoff  distances,  assuring  mobility  to  support 
multiple  points  of  entry,  enabling  movement  and  maneuver  for  projecting  offensive 
operations  and  protecting  austere  combat  posts.  When  determining  a  research  ob¬ 
jective,  these  roadmaps  provide  a  broad  range  of  missions  from  which  to  establish  a 
loyal  wingman  command  and  control  architecture  and  candidate  mission  scenarios. 
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1.1.2  2013-2038  USAF  RPA  Vector. 


The  2013-2038  United  States  Air  Force  (USAF)  Remotely  Piloted  Aircraft  (RPA) 
Vector  [3]  distinguishes  the  loyal  wingman  concept  from  such  topics  as  swarms  of 
unmanned  vehicles.  It  is  envisioned  that  a  loyal  wingman  would  be  used  to  accom¬ 
pany  a  manned  lead  aircraft  to  accomplish  such  missions  as  ISR,  air  interdiction, 
attacks  against  adversary  integrated  air  defenses,  offensive  counter  air  or  act  as  a 
weapons  “mule”  increasing  the  airborne  weapons  availability  to  the  shooter.  The 
USAF  RPA  Vector  [3]  envisions  a  loyal  wingman  aiding  in  manned  missions  acting 
fully  autonomously  by  2030. 

1.1.3  Technology  Horizons. 

The  US  Air  Force  Technology  Horizons,  published  by  the  chief  scientist  of  the  US 
Air  Force  provides  key  science  and  technology  focus  areas  for  scientists  and  engineers 
to  pursue  that  will  provide  technologically  achievable  capabilities  to  enable  the  Air 
Force  to  contribute  to  the  US  joint  force  effectiveness  in  2030  and  beyond  [4],  This 
document  highlights  one  of  three  grand  challenges  for  the  Air  Force  in  the  science 
and  technology  area  as  the  need  of  creating  fractionated,  composable,  survivable, 
autonomous  systems.  A  loyal  wingman  should  address  the  following  four  attributes. 

1.1. 3.1  Fractionated. 

Modern  systems  are  composed  of  subsystems  that  are  physically  integrated  to 
form  a  full  system,  such  as  the  current  aircraft  strike  systems  which  integrate  com¬ 
munication,  ISR,  electronic  warfare  and  strike  capability  into  a  single  platform.  The 
integration  of  multiple  subsystems  into  a  single  system  may  result  in  performance  and 
cost  impacts  such  as  limits  on  range  and  high  production  and  operating  costs.  The 
loss  of  any  one  subsystem  may  result  in  mission  failure,  decreasing  the  survivability 
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of  the  system  as  a  whole.  Moving  from  integrated  to  fractionated,  “the  system  is,  in 
effect,  separated  into  fractional  elements  and  physically  dispersed  [4].”  If  the  frac¬ 
tionation  is  done  properly,  the  communication  between  the  fractionated  subsystems 
remains  small  enough  to  avoid  detection  or  to  continue  to  operate  in  the  presence 
of  jamming.  There  are  multiple  ways  a  loyal  wingman  could  be  used  to  realize  the 
fractionated  system.  One  example  is  to  establish  a  formation  of  UAVs  whose  capa¬ 
bilities  are  the  subsystems  of  a  modern  aircraft  strike  system.  The  manned  lead  acts 
as  the  pilot  controlling  the  various  subsystems,  and  the  subsystems  themselves  are 
“fractionated”  into  physically  dispersed  elements  of  a  formation.  In  order  to  perform 
a  strike,  the  various  fractionated  subsystems  could  fly  to  and  orient  themselves  in  a 
way  that  is  best  suited  for  the  mission  as  a  whole.  The  ISR  subsystem  loyal  wingman 
flies  to  the  area  that  needs  to  be  surveyed,  while  the  electronic  warfare  subsystem 
wingman  flies  to  a  slightly  different  location  and  orients  itself  in  a  way  to  jam  the 
enemy’s  awareness.  Meanwhile,  the  ordnance-carrying  wingman  may  fly  to  its  as¬ 
signed  location  and  orient  itself  most  appropriately  for  unloading  the  ordnance  [4]. 
A  candidate  scenario  should  support  the  use  of  a  single  loyal  wingman  as  a  frac¬ 
tionated  subsystem  supporting  a  large  system  as  well  as  requiring  a  low  amount  of 
communication  to  reduce  risk  of  detection. 

1.1. 3. 2  Composable. 

The  Technology  Horizons  [4]  describes  composable  systems  as  the  ability  to  quickly 
assemble  the  appropriate  capabilities  into  a  package  that  may  meet  a  specific  mission 
requirement.  If  one  considers  a  fractionated  system,  then  a  mission  that  requires  re¬ 
connaissance  may  require  only  one  loyal  wingman.  However,  a  mission  that  requires 
first  locating  a  target,  then  striking  the  target,  may  combine  multiple  fractionated 
subsystems  into  a  single  passage  in  a  rapid  fashion.  The  loyal  wingman  concept 


4 


supports  composability  because  any  number  of  loyal  wingmen  with  varying  capabil¬ 
ities  may  be  packaged  together  in  near  real-time  to  accomplish  any  set  of  proposed 
missions.  Depending  on  the  mission,  these  loyal  wingmen  may  work  in  close  commu¬ 
nication  or  may  be  able  to  handle  their  individual  missions  autonomously  with  little, 
to  no  communication  with  other  loyal  wingmen  or  the  manned  lead. 

1.1. 3. 3  Survivable. 

The  term  survivable  could  pertain  to  either  a  machine  or  human  subsystem.  The 
failure  of  a  subsystem  in  a  fully  integrated  system  more  likely  results  in  full  mission 
failure.  If,  however,  the  system  was  composed  of  fractionated  subsystems  and  one 
of  those  subsystems  fail,  the  mission  may  continue  after  the  loss.  Survivability  may 
also  refer  to  the  survivability  of  the  human  operator.  An  exceptional  justification  for 
unmanned  systems  are  they  allow  humans  to  operate  or  lead  a  mission  from  a  safe 
location.  The  loyal  wingman  may  be  sent  into  an  austere  and  dangerous  environment, 
which  may  put  the  UAV  at  risk,  but  will  increase  the  survivability  of  the  human. 

1.1. 3. 4  Autonomous. 

Current  systems  already  attain  a  certain  level  of  autonomy,  but  the  Technology 
Horizons  [4]  suggests  that  an  increase  in  the  level  of  autonomy  will  produce  a  wider 
range  of  Air  Force  functions  in  the  future.  Key  attributes  of  autonomy  include  com¬ 
plex  decision  making,  autonomous  mission  planning,  and  the  ability  to  self-adapt  as 
the  environment  changes.  A  loyal  wingman  may  fulfill  this  need  by  autonomously 
planning  a  mission  that  meets  the  manned  leads  commanded  mission  objectives  and 
dynamically  re-plan  the  mission  in  the  face  of  a  changing  mission  environment  with 
a  minimal  amount  of  communication  with  the  manned  lead  or  other  loyal  wingman 
subsystems. 
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1.1.4  Air  Force  Research  Laboratory  Autonomy  Strategy. 

Derived  from  the  previously  mentioned  Technology  Horizons  [4],  the  Air  Force 
Research  Laboratory  (AFRL)  established  their  autonomy  science  and  technology 
strategy  with  a  vision  of  “Intelligent  machines  seamlessly  integrated  with  humans 
-  maximizing  mission  performance  in  complex  environments”  [5].  The  third  of  four 
goals  is  to  ensure  continuation  of  the  mission  in  environments  with  kinetic  and  non- 
kinetic  threats.  In  order  to  meet  this  goal,  systems  must  be  developed  which  are  able 
to  protect  themselves  and  adapt  to  the  environment  faster  than  an  adversary.  The 
loyal  wingman  will  support  this  vision  by  adjusting  to  a  changing  threat  environment 
and  appropriately  re-planning  its  mission. 

1.2  Loyal  Wingman  Definition  and  Candidate  Scenario 

The  documents  discussed  in  Section  1.1  [1,  2,  3,  4]  allow  a  broad  framework  for  the 
command  and  control  architecture  as  well  as  the  specific  use  of  a  loyal  wingman.  These 
broad  definitions  allow  scientists  and  engineers  the  freedom  to  research  various  tools 
and  techniques  for  realizing  loyal  wingman  technology.  In  order  to  identify  a  specific 
research  objective,  this  section  establishes  a  specific  definition  for  the  term  loyal 
wingman  and  a  command  and  control  framework.  In  addition,  a  candidate  mission 
scenario  is  established  that  when  solved,  answers  the  research  questions  produced 
herein. 

1.2.1  Loyal  Wingman  Definition. 

Throughout  this  work  the  term  loyal  wingman  refers  to  an  uninhabited  aerial  ve¬ 
hicle  that  flies  autonomously  under  the  tactical  command  of  a  manned  lead  air  vehicle 
and  includes  terms  such  as  unmanned  loyal  wingman,  uninhabited  loyal  wingman  and 
loyal  wingman.  The  uninhabited  loyal  wingman  is  not  a  remotely  piloted  aircraft.  In- 
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stances  that  reference  a  vehicle  controlled  or  piloted  by  a  man  will  specify  “manned” 
as  part  of  the  description. 

1.2.2  Loyal  Wingman  Research  Framework. 

Morales  [6]  provides  an  informal  survey  to  address  who  is  in  control  (man  or 
machine)  of  an  uninhabited  vehicle  and  under  what  conditions.  The  consensus  is 
that  a  manned  lead  pilot  is  already  saturated  with  tasks  and  wishes  to  push  as  many 
decisions  as  possible  to  their  wingmen.  Regarding  an  uninhabited  loyal  wingman, 
this  implies  as  much  autonomy  in  the  unmanned  system  as  is  practical.  Therefore 
a  command  and  control  framework  consistent  with  modern  manned  operations  is 
defined  in  which  a  manned  lead  provides  clear  and  concise  direction  with  very  little 
need  of  communication  bandwidth.  The  uninhabited  loyal  wingman  must  perform 
the  following  tasks: 

•  Autonomously  compute  a  mission  plan  based  on  communication  from  lead 

•  Dynamically  re-plan  mission  in  event  of  a  changing  threat  environment  or  chang¬ 
ing  mission  requirement 

Other  frameworks  are  possible  areas  of  research,  but  are  not  pursued.  These 
other  frameworks  include  the  Unmanned  Aerial  Vehicle  (UAV)  acting  as  a  central 
planner  for  all  other  UAVs  in  the  formation.  Such  a  framework  would  consider  a 
conglomerate  of  tasks  such  that  a  computer  onboard  a  single  UAV  computes  mission 
paths  for  all  UAVs  in  formation  and  communicates  these  paths  to  other  UAVs  in 
the  formation  to  accomplish  all  mission  tasks.  An  additional  framework  may  be  a 
collaborative  control  environment  similar  to  that  of  swarming  behaviors  in  which  all 
UAVs  in  formation  are  in  constant  communication  with  each  other  canvassing  which 
vehicle  is  accomplishing  which  task  and  continually  working  together  to  determine 
the  optimal  solution  to  accomplish  all  mission  tasks. 
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1.2.3  Candidate  Scenario. 


A  formation  of  aerial  vehicles,  including  a  manned  lead  accompanied  by  four 
unmanned  loyal  wingmen  enter  an  austere  environment.  The  manned  lead  receives 
a  multiple-objective  tasking  that  includes  dropping  ordnance  on  multiple  locations. 
The  manned  lead  distributes  a  task  to  a  single  loyal  wingmen,  which  includes  dropping 
ordnance  on  a  single  location  and  meeting  back  in  the  formation.  The  mission  and 
several  fixed-  and  variable-time  and  variable-location  endpoint  scenarios  must  be 
accomplished. 

The  mission  occurs  in  an  austere  environment  in  which  threats  exist,  such  as  anti¬ 
aircraft  artillery.  Many  threats  will  be  non-deterministic  and  dynamic,  meaning  the 
threats  are  only  approximately  known  and  change  with  time.  There  also  will  exist 
scenarios  in  which  accomplishing  the  mission  objectives  while  fully  avoiding  these 
threats  is  infeasible  due  to  a  fortified  target;  therefore,  the  loyal  wingman  autonomous 
mission  planner  must  be  able  to  receive  the  tasked  mission  plan  and  compute  a  mission 
path  that  accomplishes  the  primary  mission  objectives  while  either  fully  avoiding  or 
minimizing  exposure  to  threats,  and  if  necessary,  minimizing  time  to  rendezvous. 

Upon  execution  of  the  computed  mission  plan,  intelligence  may  identify  additional 
static  or  dynamic  “pop-up”  threats  that  were  not  known  when  the  original  path  was 
computed.  The  loyal  wingman  must  autonomously  determine  whether  a  change  in 
the  mission  environment  warrants  a  change  in  the  original  path  and,  if  necessary, 
update  the  original  path  and  continue  the  mission.  As  a  means  of  realizing  a  more 
autonomous  system,  communication  will  be  limited  to  the  manned  lead  providing  the 
mission  objective  as  well  as  communication  on  the  changing  threat  environment.  The 
loyal  wingman  will  only  communicate  its  ability  to  accomplish  the  mission  and  there 
will  be  no  communication  between  the  loyal  wingmen  on  their  individual  mission 


1.3  Research  Objective  and  Contributions 


Chapter  II  provides  a  summary  review  of  literature  relevant  to  solving  the  loyal 
wingman  problem.  This  includes  a  look  at  MUM-T  technical  works  which  focus  on 
cognitive  systems  engineering,  defining  where  the  man’s  task  should  end  and  the 
computer’s  task  begin,  as  well  as  proposed  solution  techniques  for  solving  various 
definitions  of  the  loyal  wingman  problem.  There  is  a  vast  amount  of  research  avail¬ 
able  on  techniques  for  controlling  UAVs,  in  general,  and  a  significant  amount  of  that 
research  may  be  applied  to  various  loyal  wingman  definitions.  Chapter  II  addition¬ 
ally  reviews  various  methods  for  modeling  static  threats  as  well  as  modeling  and 
estimating  threats  that  are  moving  and  non-detcrministic.  As  a  result  of  the  review, 
it  is  proposed  that  the  loyal  wingman  problem  as  defined  herein  can  be  formulated 
and  solved  using  optimal  control  and  stochastic  estimation  techniques.  The  objective 
of  this  research  is  to  contribute  to  solving  the  DoD  concept  of  the  loyal  wingman 
with  an  assumed  level  of  autonomy  using  optimal  control  and  stochastic  estimation 
techniques.  This  will  be  shown  by  answering  the  following  three  research  questions: 

1.  How  do  you  formulate  the  uninhabited  loyal  wingman  optimal  control  problem 
in  the  presence  of  static,  deterministic  threats? 

2.  How  do  you  formulate  the  uninhabited  loyal  wingman  optimal  control  problem 
with  moving,  non-detcrministic  threats? 

3.  How  and  when  do  you  dynamically  re-plan  the  uninhabited  loyal  wingman  opti¬ 
mal  control  problem  in  the  presence  of  various  deterministic  and  non-deterministic 
pop  up  threats  that  may  arise  during  execution  of  a  pre-computed  path? 

By  answering  these  three  questions,  several  contributions  are  made  that  both  add  to 
the  existing  body  of  knowledge  as  well  as  spawn  additional  areas  of  research.  These 
contributions  include: 
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1.  a  methodology  for  a  rapid  near  optimal  solution; 


2.  potential  use  in  a  near  real-time  environment; 

3.  a  near  optimal  solution  for  comparison  against  other  methods. 

1.4  Assumptions  and  Limitations 

Assumptions  and  limitations  stated  throughout  the  work  are  now  summarized. 

1.4.1  Mission  Scenario  Assumptions  and  Limitations. 

Manned  pilots  view  the  main  purpose  of  having  a  manned  loyal  wingman  is  to 
look  after  and  protect  one- another  in  austere  environments  through  ‘mutual  support’; 
however,  the  loyal  wingman  is  unable  to  provide  mutual  support  because  there  are 
no  suitable  sensors  on  board  and  the  capability  does  not  exist  for  a  UAV  to  engage 
in  autonomous  aerial  combat  maneuvering  against  a  manned  or  unmanned  enemy. 
Generally,  the  loyal  wingman  must  avoid  detection  by  enemy  aircraft  because  it  is 
assumed  the  loyal  wingman  cannot  survive  an  aerial  combat  situation. 

Although  a  primary  benefit  of  the  use  of  UAVs  is  to  reduce  the  risk  of  putting  a 
manned  pilot  in  danger,  the  uninhabited  loyal  wingman  is  not  considered  attritable 
and  effort  must  be  made  to  ensure  the  uninhabited  loyal  wingman’s  safe  return  to 
the  formation. 

Communication  between  a  manned  lead  and  the  loyal  wingman  will  generally  be 
established  on  pre-defined  terms  that  a  computer  may  interpret.  The  “dictionary” 
in  this  research  will  be  limited  to  intermediate  waypoint  and  rendezvous  times  and 
objectives.  There  is  no  communication  between  the  individual  unmanned  vehicles  in 
the  formation. 

Computational  resources  on  board  the  UAV  are  limited  and  allocated  to  many 
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subsystems;  therefore,  resource  allocation  for  route  planning  is  limited  to  initial  com¬ 
putation  and  in  cases  when  the  mission  environment  warrants  a  trajectory  re-plan. 
There  will  not  be  a  high-frequency  re-plan  of  the  trajectory  as  this  will  consume 
valuable  resources. 

The  loyal  wingman  must  be  able  to  make  online,  near  real-time  calculations. 
There  is  not  a  pre-dehned  mission  or  a  library  of  pre-defmed  maneuvers  or  paths  as 
in  Carter  [7].  The  loyal  wingman  computes  the  mission  when  the  task  is  provided  by 
the  manned  lead.  Although  computational  efficiency  will  be  leveraged  where  it  can  be 
found,  the  goal  of  this  research  is  not  to  attain  the  computation  time  needed  for  real¬ 
time  implementation  and  it  is  expected  that  there  may  be  other  computational  tools 
and  better  programming  techniques  that  can  be  used  to  achieve  true  near  real-time 
computation. 

1.4.2  Loyal  Wingman  Model  Assumptions  and  Limitations. 

Due  to  the  similarities  with  Smith  [8],  a  modified  version  of  the  five-state,  3 
degree  of  freedom  model  is  used  in  which  roll  rate  and  normal  acceleration  rate 
are  the  controls.  Because  the  mission  will  be  performed  in  a  local  environment,  a 
flat,  non-rotating  Earth  is  assumed.  High  frequency  updates  on  the  position  of  non- 
deterministic  threats  are  available  from  a  sensor  onboard  the  loyal  wingman.  Low 
frequency  updates  on  threat  and  mission  environment  are  provided  via  a  friendly  ISR 
asset  operating  in  the  vicinity  of  the  mission. 

The  work  herein  seeks  to  obtain  the  optimal  path  for  the  loyal  wingman  to  accom¬ 
plish  the  established  mission;  this  is  often  referred  to  as  outer-loop  control.  Following 
computation  of  the  outer-loop  optimal  path,  it  is  then  necessary  to  utilize  an  inner- 
loop  controller  in  order  to  appropriately  track  that  path.  Additional  information  on 
techniques  such  as  sliding  mode  control  [9]  used  by  Harl  [10]  and  the  linear  quadratic 
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regulator  used  by  Stankovic  [11]  and  Venkataramanan  [12,  13]  are  available,  however 
it  is  assumed  these  or  any  other  inner-loop  tracking  techniques  are  capable  of  tracking 
the  computed  optimal  path  and  are  not  pursued  as  part  of  this  research.  Wind  is 
not  modeled  because  the  assumed  inner-loop  controller  will  be  built  with  appropriate 
models  to  account  for  disturbances  due  to  wind. 

1.4.3  Threat  Assumptions  and  Limitations. 

It  is  assumed  that  threats  can  be  modeled  analytically  using  superquadrics  [14]. 
For  simplicity,  dynamic  threats  will  travel  along  a  straight  road  whose  coordinates 
are  known  a  priori.  The  threat  will  remain  on  the  road,  but  the  location  along  the 
road  at  any  time  is  not  known.  A  linear  dynamic  model  is  developed  and  estimates 
may  be  assumed  to  have  a  Gaussian  distribution. 

1.5  Document  Outline 

Chapter  I  motivated  the  problem  and  established  a  definition  and  candidate  sce¬ 
nario  in  order  to  narrow  the  specific  area  of  research.  Chapter  II  provides  a  review  of 
literature  supporting  how  this  research  contributes  to  the  existing  body  of  knowledge 
as  well  as  provides  a  set  of  techniques  for  enabling  a  solution.  In  Chapter  III,  the  op¬ 
timal  control  problem  is  formulated  for  various  scenarios  and  initial  results  highlight 
a  challenge  in  the  optimal  control  community  with  gradient-based  numerical  methods 
getting  ‘stuck’  in  locally  optimal  solutions.  Chapter  IV  discusses  a  hybrid  gradient- 
heuristic  technique  to  ameliorate  the  challenge  identified  in  Chapter  III  and  results 
for  a  3-D  model  are  presented  in  Chapter  V  to  indicate  a  rapid  technique  to  provide 
an  accurate  locally  optimal  solution.  Chapter  VI  answers  research  question  two  by 
identifying  a  dynamic  and  measurement  update  model  as  well  an  estimation  tool  to 
account  for  and  model  moving,  non-detcrministic  threats.  The  final  research  question 
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is  answered  in  Chapter  VII  by  developing  a  mission  algorithm  flow-chart  that  com¬ 
putes  and  simulates  a  trajectory,  and  provides  a  formulation  for  determining  if  and 
when  a  trajectory  re-plan  should  occur.  Finally,  Chapter  VIII  provides  a  summary  of 
relevant  contributions  and  recommendations  for  future  work.  The  Appendices  pro¬ 
vide  additional  detail  and  support  for  various  technical  discussions  throughout  the 
document. 
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II.  Literature  Review 


MUM-T  and  the  loyal  wingman  are  mission  applications  ripe  for  exploring  various 
tools  and  techniques  that  may  be  used  to  realize  their  concepts.  This  chapter  begins 
with  a  review  of  works  in  the  open  technical  literature  that  attempt  to  solve  various 
aspects  of  the  MUM-T  problem.  In  order  for  the  research  herein  to  establish  a 
methodology  that  will  contribute  to  the  existing  body  of  knowledge,  a  review  is  first 
performed  to  examine  the  control  techniques  that  are  used  across  the  four  attributes 
of  UAV  formation  flying.  Next,  a  discussion  on  nonconvex  optimal  control  leads  to  a 
hybrid  technique,  using  a  particle  swarm  optimization  algorithm  to  seed  a  Nonlinear 
Programming  (NLP)  solver.  The  remainder  of  the  chapter  provides  a  dynamic  model 
for  the  loyal  wingman,  a  review  of  methods  for  modeling  threats,  as  well  as  a  discussion 
on  stochastic  estimation  techniques  for  dynamic  trajectory  re-planning. 

2.1  MUM-T  and  Loyal  Wingman  Literature 

The  DoD  established  a  roadmap  for  UAVs  [1,  2]  and  established  autonomy  as  a 
grand  challenge  [4],  Clough  [15]  highlights  challenges  to  achieving  autonomy  in  UAVs, 
while  others  such  as  Yang  [16]  provide  a  recommendation  on  the  appropriate  balance 
in  autonomy  to  achieve  the  ‘right’  level  of  interaction  between  man  and  machine. 
Gangl  [17]  offers  an  experiment  in  the  management  of  multiple  UAVs  from  a  single- 
seat  fighter,  while  Schmitt  [18]  provides  a  mission  planner  able  to  support  a  pilot 
during  MUM-T  missions,  and  Jameson  [19]  recommends  an  approach  for  teams  of  ve¬ 
hicles  to  coordinate  their  activities  without  human  oversight.  Potential  applications 
of  such  research  was  reviewed  by  Svenmarck  [20].  Van  Riper  [21]  and  Durbin  [22] 
each  provide  an  assessment  of  operator  workload  on  a  current  operational  application 
of  MUM-T:  the  Army  Apache  AH-64D.  Additionally,  the  Navy  successfully  demon- 
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strated  the  first-ever  carrier  launch  and  recovery  of  an  unmanned  aircraft  [23]. 

The  U.S.  Air  Force  is  developing  its  version  of  MUM-T  in  the  concept  of  the  loyal 
wingman,  technology  that  differs  from  swarming  because  the  unmanned  system  will 
work  with  and  interact  with  the  manned  aircraft  in  various  tactical  missions  [4] .  The 
loyal  wingman  serves  a  manned  lead  in  two  broad  categories:  mutual  support  and  for¬ 
mation  flying.  Mutual  support  stems  from  the  Airman’s  Creed  and  the  idea  that  one 
will  “never  leave  an  airman  behind  [24].”  This  gives  rise  to  the  idea  of  a  loyal  wing- 
man  supporting  the  manned  lead  through  aerial  combat.  Carter  [7]  created  a  library 
of  time  optimal  maneuvers  for  use  by  a  remote  pilot.  McGrew  [25]  used  approximate 
dynamic  programming  and  Horie  [7]  used  a  two-sided  optimization  approach  to  solve 
the  highly  dynamic  minimax  problem  existent  in  the  pursuer-evader  control  problem. 
The  computational  complexity  of  the  highly  dynamic,  non-deterministic  mutual  sup¬ 
port  mission  makes  it  impractical  to  develop  real-time,  on-line  optimal  solutions  and 
is  therefore  not  pursued  in  the  research  herein. 

The  second  broad  category  for  supporting  a  manned  lead  is  formation  flying. 
Murphey  [26]  identifies  a  number  of  challenges  associated  with  integrating  unmanned 
systems  into  the  joint  force,  to  include  formation  flight  for  autonomous  aerial  refuel¬ 
ing,  which  DARPA  successfully  demonstrated  [27].  Waydo  [28]  specifically  addressed 
the  concept  of  the  loyal  wingman  which  describes  a  flight  test  using  inner-loop  op¬ 
timal  control  techniques  to  demonstrate  safe  formation  flight  of  a  loyal  wingman  in 
the  event  of  a  loss  of  communication.  The  test  objective  was  to  maintain  formation, 
but  in  the  event  of  a  loss  of  communication,  the  loyal  wingman  departed  to  a  safe 
formation  flying  distance,  then  returned  to  normal  formation  distance  once  the  com¬ 
munication  was  re-established.  Waydo  demonstrated  optimal  control  techniques  for 
maintaining  formation,  but  did  not  indicate  the  loyal  wingman’s  ability  to  perform 
splinter  activities  beginning  and  ending  in  formation,  nor  does  it  consider  avoidance 


15 


of  threats. 


Performed  in  support  of  the  U.S.  Army,  Garcia’s  [29]  research  goal  was  to  introduce 
a  method  to  integrate  UAS  into  a  highly  functional  manned/unmanned  team  through 
design  and  implementation  of  3-D  distributed  formation/flight  control  algorithms 
with  the  goal  to  act  as  wingmen  for  manned  aircraft.  Artificial  potential  functions 
simulated  the  ability  of  a  group  of  unmanned  loyal  wingmen  to  initialize  formation 
through  a  self-ordering  algorithm,  hold  a  standard  formation,  dynamically  modify  the 
formation  and  perform  a  splinter  activity,  which  are  the  four  attributes  of  formation 
flying.  Additionally,  non-deterministic  threats  were  avoided  using  fuzzy  reasoning. 
Additional  discussion  on  the  methods  used  by  Garcia  is  found  in  Section  2.2. 

The  definitions  and  scenarios  that  are  solved  vary  greatly  in  each  of  the  works  of 
this  review,  which  indicates  there  is  still  a  significant  amount  of  research  remaining  for 
solving  MUM-T  applications.  Garcia’s  [29]  mission  scenario  most  closely  resembled 
the  loyal  wingman  as  defined  herein.  However,  additional  research  is  necessary  to 
solve  the  loyal  wingman  problem  as  defined  herein,  because  threats  must  be  avoided 
and  exposure  minimized  while  meeting  optimization  criteria.  Additionally,  the  use  of 
fuzzy  reasoning  is  not  an  optimal  method  for  estimating  the  state  of  non-deterministic 
threats.  The  research  herein  provides  a  contribution  to  the  engineering  community  by 
demonstrating  an  optimal  control  and  stochastic  estimation  methodology  for  solving 
the  loyal  wingman  problem  as  defined  herein.  The  methodology  will  be  chosen  from 
a  review  of  techniques  used  to  control  UAVs  in  the  next  section. 

2.2  UAV  and  Formation  Control  Techniques 

In  order  to  address  the  research  goals,  this  section  reviews  techniques  that  are 
used  throughout  literature  to  control  UAVs  and  groups  of  UAVs.  The  desire  is  to 
determine  a  methodology  that  will  aid  in  solving  the  loyal  wingman  candidate  scenario 
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as  defined  herein  that  is  unique  in  relation  to  other  works  that  have  addressed  the  loyal 
wingman.  As  a  reminder,  Section  1.4  established  that  inner- loop  control  will  not  be 
addressed  in  the  work  herein.  The  methods  discussed  in  the  following  four  subsections 
address  outer-loop  techniques  to  control  UAVs  and  groups  of  UAVs  throughout  the 
four  attributes  of  formation  flight.  For  the  purposes  of  the  ensuing  discussion,  outer- 
loop  control  will  be  further  parsed  into  two  categories,  one  in  which  periodic  tracking 
updates  provide  for  sequential  control.  The  other  category  entails  calculating  the 
vector  of  control  inputs  for  a  full  outer-loop  path  trajectory. 

2.2.1  Formation  Rendezvous. 

Formation  rendezvous  is  where  two  or  more  vehicles  join  together  from  physically 
separate  locations  into  a  formation  that  is  intended  to  be  held  for  some  finite  period  of 
time1.  Techniques  for  controlling  a  vehicle  to  a  successful  rendezvous  include  graph 
theory,  the  Dijkstra  algorithm,  proportional  navigation,  line  of  sight,  sliding  mode 
terminal  guidance,  and  artificial  potential  functions,  all  of  which  are  of  the  category 
of  outer-loop  control  defined  herein  that  include  sequential  path  planning  control. 

Giuletti  [30]  discusses  graph  theory  and  the  Dijkstra  algorithm  to  define  an  optimal 
formation  configuration.  The  objective  function  is  set  to  optimize  the  communication 
and  data  transfer  amongst  the  vehicles  in  the  formation.  Graph  theory  is  utilized  by 
supposing  the  individual  aircraft  are  nodes  on  a  graph  and  the  lines  of  communication 
between  them  as  the  edges  that  connect  the  nodes.  By  establishing  a  set  of  assump¬ 
tions,  the  problem  is  reduced  to  a  shortest  path  problem  and  solved.  The  Dijkstra 
algorithm  is  chosen  because  it  has  only  polynomial  complexity,  guarantees  optimality 
and  is  deterministic.  Graph  theory  and  the  Dijkstra  algorithm  are  mentioned  upfront 

because  their  use  is  pervasive  in  the  technical  literature.  However,  their  use  here  is 

1  Target  rendezvous  entails  the  problem  of  multiple  vehicles  from  separate  physical  locations 
coordinating  on  their  arrival  at  a  location.  Although  they  may  rendezvous  in  a  close  vicinity,  in  this 
case,  they  are  not  joining  together  into  a  formation. 
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in  determining  the  optimal  formation,  not  in  determining  how  to  maneuver  a  vehicle 
or  series  of  vehicles  into  formation. 

Used  by  Smith  [31]  to  guide  a  vehicle  for  aerial  refueling,  Proportional  Navigation 
(PN),  “generates  control  accelerations  proportional  to  the  measured  rate  of  rotation 
of  the  interceptor-target  line-of-sign  and  directs  these  forces  so  as  to  reduce  or  hold 
constant  the  rotational  rate  of  the  line-of-sight  [32].”  The  MIT  Draper  Laboratory  [33] 
developed  and  tested  a  flight  control  system  to  achieve  mid-air  rendezvous  of  two 
UAVs.  Tahk  [34]  proposes  using  only  line-of-sight  angles  to  other  vehicles  measured 
by  visual  sensors  or  radars,  which  requires  little  to  no  communication,  makes  the 
system  more  survivable  to  communication  breakdowns,  and  aids  in  allowing  a  stealth 
mission.  Tahk  uses  two  separate  guidance  laws  that  put  the  vehicles  on  the  correct 
path  to  formation  rendezvous,  using  line-of-sight  only. 

Harl  [10]  uses  Sliding  Mode  Terminal  Guidance  (SMTG)  [35]  to  rendezvous  air¬ 
craft  to  a  final  formation  location  because  communication  is  not  needed  when  using 
sliding  mode.  The  sliding  surfaces  used  are  the  terminal  constraint,  which  is  the  final 
formation  position  for  each  UAV.  A  vector  of  m  sliding  mode  vectors  are  established 
and  they  are  “slid”  to  zero  by  defining  a  Lyaponov  function  and  setting  its  derivative 
to  zero.  This  method  does  not  guarantee  the  UAVs  arrive  at  the  same  time,  but  it 
does  guarantee  they  will  arrive  in  a  finite  time. 

Stickney  [36]  describes  the  use  of  artificial  potential  functions  may  be  used  by 
identifying  an  objective  function  as  an  attractive  force  and  the  constraint  function 
as  a  repulsive  force.  This  allows  the  vehicle  to  be  controlled  in  such  a  way  that  it  is 
attracted  to  objectives  and  repulsed  from  constraints. 

Garcia  [29]  performs  a  formation  rendezvous  using  artificial  potential  functions  by 
assigning  a  global  variable  to  the  manned  lead  vehicle  which  “tells”  the  unmanned 
vehicle  where  it  should  go  regardless  of  where  it  begins.  The  first  unmanned  UAV  to 
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arrive  by  way  of  an  artificial  potential  field  sink,  assigns  itself  as  the  follower  of  the 
manned  leader  and  sets  a  search  for  a  follower  to  join  its  lead.  This  eliminates  the  need 
for  the  global  variable  and  each  subsequent  unmanned  UAV  begins  to  join  formation  in 
a  self-ordering  algorithm  according  to  the  identified  desired  formation.  The  algorithm 
searches  for  a  lead  UAV  which  has  a  follower  spot  open  and  the  algorithm  continues 
with  each  UAV  assigning  to  itself  a  leader  and  a  follower  until  all  UAVs  have  a  “spot” 
in  the  formation.  This  self-ordering  algorithm  establishes  a  communication  scheme 
that  becomes  valuable  during  other  attributes  of  formation  flight:  formation  hold, 
formation  reconfiguration  and  splinter  activities  [37]. 

2.2.2  Formation  Hold. 

Following  successful  rendezvous  at  the  appropriate  location,  the  challenge  then 
turns  to  controlling  the  UAV  for  maintaining  the  formation.  All  of  the  methods 
identified  in  this  section  are  in  the  outer-loop  control  category  defined  herein  for 
sequential  path  planning. 

Artificial  potential  functions  are  used  as  examples  of  control  in  both  formation 
rendezvous  and  maintaining  the  formation.  Paul  [38]  provides  a  description  on  the 
components  of  a  potential  held  and  how  a  3-D  potential  held  may  be  constructed. 
After  using  a  self-ordering  algorithm  for  formation  rendezvous,  Garcia  [29]  uses  ar¬ 
tificial  potential  functions  to  maintain  the  formation.  A  challenge  with  maintaining 
formation  due  to  communication  latiency  was  addressed  with  fuzzy  logic.  The  work 
herein  will  consider  other  optimal  estimation  techniques  (Section  2.6). 

Tanner  [39]  uses  graph  theory  [40]  to  spatially  relate  components  of  a  hock  to  one 
another  and  then  used  artificial  potential  functions  as  the  control  law.  Tanner  [39]  be¬ 
gins  by  using  a  hxed  topology,  meaning  the  set  of  ‘neighbors’  surrounding  the  ‘agent’ 
is  hxed  with  time.  This  hocking  model  superimposes  three  steering  behaviors,  sepa- 
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ration,  alignment  and  cohesion  that  result  in  all  agents  moving  toward  a  formation 
while  avoiding  collisions.  Tanner  [41]  then  uses  the  same  methodology  to  maintain 
formation  with  a  dynamic  neighbor  topology. 

‘Swarming’  is  an  attempt  at  extending  what  is  observed  in  nature  to  swarms  of 
autonomous  vehicles.  Consider  a  flock  of  birds  or  school  of  fish.  These  animals  are  in 
formation  -  though  no  one  told  them  to  get  into  formation.  They  fly  together,  swim 
together,  turn  together,  attain  a  goal  together,  yet  there  is  no  clear  leader  among 
them.  Each  of  the  individual  entities  of  the  flock  seem  to  naturally  relate  to  the 
members  surrounding  them  in  such  a  way  that  the  group  as  a  whole  acts  as  one. 
The  engineering  community  has  attempted  to  describe  and  repeat  this  behavior  in 
autonomous  vehicles. 

Kovacina  defines  emergent  behavior  as  “the  ability  to  accomplish  complex  objec¬ 
tives  through  synergistic  interactions  of  simple  reactionary  components  [42].”  Cucker 
[43]  uses  the  example  of  the  study  of  emergent  behavior  in  a  flock  of  birds,  and  presents 
a  mathematical  model  on  the  tendency  of  the  flock  to  converge  to  a  common  velocity. 
Kovacina  [42]  develops  a  control  algorithm  and  demonstrates  the  effectiveness  via 
simulation  of  a  swarm  of  vehicles  to  search  and  map  a  chemical  cloud.  In  this  emer¬ 
gent  behavior  control  algorithm,  individual  agents  gather  information  in  their  local 
environment  and  share  that  information  with  the  rest  of  the  swarm.  As  information 
increases,  the  swarm  will  flock  to  the  area  where  help  is  needed  to  accomplish  the 
goal.  With  this  behavior,  it  is  claimed  that  a  task  that  is  not  possible  for  any  one 
agent  can  be  accomplished  by  a  group  of  agents. 

Often  used  as  an  inner-loop  control  technique,  the  Linear  Quadratic  Regulator 
(LQR)  uses  feedback  to  adjust  the  control  input  in  order  to  maintain  a  referenced 
trajectory.  Outside  of  the  application  to  UAVs,  Stankovic  [11]  creates  a  scenario  of 
a  platoon  of  vehicles  traveling  in  a  straight  line  formation  as  part  of  an  integrated 
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vehicle  highway  system.  LQR  optimization  is  used  to  maintain  a  constant  separation 
distance  between  the  vehicles. 

Venkataramanan  [12]  uses  the  MATLAB/Simulink  software  to  simulate  the  inner- 
and  outer-loop  control  structure  for  a  group  of  UAVs  flying  in  formation.  The  outer- 
loop  controller  consisted  of  the  virtual  leader’s  motion,  the  individual  UAV,  the  indi¬ 
vidual  UAV  reference  trajectory,  and  the  individual  UAV’s  controller.  The  inner-loop 
controller  consisted  of  UAV  equations  of  motion,  engine  dynamics  and  vortex-effect 
model.  Results  revealed  robustness  to  swapping  leader  and  follower  and  still  main¬ 
taining  appropriate  formation  separation  distances. 

Without  prediction,  a  controller  provides  control  inputs  that  are  based  on  present 
and  prior  knowledge.  By  adding  a  guess  or  prediction  on  future  desires,  the  Model 
Predictive  Controller  (MPC)  [44]  provides  inputs  based  on  the  prediction  of  future 
states.  Wcihua  [45]  uses  an  MPC  controller  primarily  to  ensure  collision  avoidance 
while  a  formation  of  UAVs  attempts  navigation  to  a  goal  location.  However,  the 
nature  of  the  decentralized  leader /follower  model  may  encourage  the  leader  to  avoid 
an  obstacle  and  cause  the  formation  to  fall  apart.  To  combat  this,  Weihua  establishes 
a  formation  error  term  which  assigned  a  cost  to  the  value  of  maintaining  the  formation 
while  navigating  between  obstacles  on  route  to  the  final  destination. 

Wang  [46]  controls  a  swarm  of  UAVs  based  on  the  presence  of  obstacles.  Without 
obstacles,  the  path  is  computed  in  safe  mode  using  an  outer-loop  LQR  path  generator. 
While  in  danger  mode  the  outer-loop  path  is  generated  using  a  Grossberg  Neural 
Network  [47].  Once  the  path  has  been  computed,  an  inner- loop  MPC  is  used  for 
tracking  the  path. 

Chao  [48]  designs  a  collision- free  UAV  formation  flight  controller  utilizing  a  de¬ 
centralized  MPC.  A  cost  function  is  designed  based  on  velocity  orientation  and  a 
distribution  between  UAV  and  obstacle  avoidance  to  guarantee  inter-vehicle  collision 
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avoidance. 


Boskovic  [49,  50,  51]  designs  a  formation-hold  autopilot  such  that  errors  converge 
to  zero  asymptotically  in  the  face  of  unknown  leader  maneuvers. 

Weihua  [52]  applies  Multiplexed  Model  Predictive  Control  (MMPC)  [53]  to  de¬ 
centralized  formation  flying.  A  typical  MPC  assumes  all  variables  are  updated  si¬ 
multaneously,  which  results  in  a  high  computational  cost  associated  with  multiple 
subsystems  that  are  part  of  the  formation.  Utilizing  the  MMPC  scheme,  the  MPC 
problem  is  solved  for  each  subsystem  (or  elements  of  the  formation)  sequentially  and 
updates  the  other  elements  of  the  subsystem  as  soon  as  the  solution  is  available.  The 
computational  efficiency  of  using  the  MMPC  approach  allowed  the  UAVs  to  adjust 
their  trajectory  sooner  and  thus  converge  more  rapidly  to  the  final  target. 

2.2.3  Formation  Reconfiguration. 

After  joining  formation  and  maintaining  that  formation,  the  mission  may  require 
a  formation  reconfiguration.  Events  such  as  failure  of  one  or  more  communication 
channels,  sensor  failure,  flight  path  restrictions  or  even  full  loss  of  aircraft  are  reasons 
a  formation  may  need  to  control  itself  into  a  new  configuration  [13].  In  addition 
to  the  control  techniques  reviewed  in  the  previous  sections,  the  artificial  potential 
function,  particle  swarm  optimization,  and  direct  collocation  are  methods  used  to 
control  formation  reconfiguration. 

Garcia  [29]  uses  artificial  potential  functions  to  self-order  into  a  new  formation, 
just  as  occurred  in  formation  rendezvous.  Duan  [54]  uses  an  “improved”  Particle 
Swarm  Optimization  (PSO)  to  find  the  optimal  reconfiguration  of  multiple  vehicles 
flying  in  a  formation.  PSO  is  discussed  further  in  Section  2.4.  APF  and  PSO  are 
sequential  outer-loop  path  planning  techniques,  but  Ma  [55]  used  an  outer-loop  control 
method  known  as  direct  orthogonal  collocation  to  determine  the  optimal  path  to 
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reconfigure  a  formation  of  satellites.  The  direct  collocation  method  will  be  discussed 
in  the  next  section. 

2.2.4  Splinter  Activities. 

Perhaps  one  of  the  most  interesting  aspects  of  formation  maneuvers  are  the  splin¬ 
ter  activities,  because  they  are  the  reason  for  the  mission.  Missions  could  include  an 
ordnance  drop,  a  battle  damage  assessment,  target  tracking,  or  some  form  of  intelli¬ 
gence,  surveillance  or  reconnaissance.  Most  of  the  methods  discussed  in  this  section 
include  the  outer-loop  category  of  sequential  path  planning  control.  However,  the 
last  two  methods  discussed,  direct  and  indirect  methods  are  used  to  calculate  a  set  of 
control  inputs  that  provide  a  full  outer-loop  path  of  the  optimal  mission  trajectory. 

Swarming  and  emergent  behavior  were  mentioned  in  the  section  on  formation 
keeping,  and  they  have  relevance  in  performing  splinter  activities  as  well.  Kovacina 
[42]  argues  that  traditional  control  techniques  do  not  provide  the  “flexibility  and 
efficiency  needed  to  meet  the  commercial  and  military  demands  placed  upon  UAV 
swarms,”  and  as  such  develops  a  control  algorithm  and  demonstrates  the  effectiveness 
via  simulation  of  a  swarm  of  vehicles  to  search  and  map  a  chemical  cloud.  In  this 
emergent  behavior  control  algorithm,  individual  agents  gather  information  in  their 
local  environment  and  share  that  information  with  the  rest  of  the  swarm.  As  infor¬ 
mation  increases,  the  swarm  will  flock  to  the  area  where  help  is  needed  to  accomplish 
the  goal.  With  this  behavior,  a  task  that  is  not  possible  by  any  one  agent  can  be 
accomplished  by  a  group  of  agents. 

Examples  of  works  using  emergent  behavior  techniques  include  Russell’s  use  of  a 
Genetic  Algorithm  (GA)  for  UAV  routing  [56].  Duan  [57]  demonstrated  the  use  of 
a  PSO  to  control  a  group  of  UAVs  to  a  target  in  minimum  time,  then  combined  the 
PSO  and  GA  into  a  hybrid  technique  in  which  each  method  was  run  in  parallel.  On 
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each  iteration,  the  method  that  provided  the  best  solution  was  used  in  the  subsequent 
iteration.  Other  biology  inspired  algorithms  include  Ant  Colony  Optimization  [58] 
and  Bee  Colony  Optimization  [59],  inspired  by  the  behavior  of  colonies  of  ants  and 
hives  of  bees,  respectively. 

The  research  herein  led  to  the  use  of  a  hybrid  optimization  technique  to  include 
the  use  of  a  PSO  algorithm  to  supply  an  initial  guess  to  gradient-based  Nonlinear 
Program  (NLP).  Additional  discussion  on  what  led  to  the  use  of  a  hybrid  optimization 
technique  and  the  PSO  algorithm  are  discussed  in  Sections  2.3  and  2.4,  respectively. 

When  considering  groups  or  swarms  of  vehicles,  multi- vehicle  collaborative  control 
is  an  area  of  interest  in  the  technical  community.  Reynolds  [60]  developed  three 
naturally  observed  rules  for  simulating  flocking  behavior:  (1)  collision  avoidance,  (2) 
velocity  matching,  and  (3)  flock  centering.  Various  works  have  used  these  flocking 
behavior  rules  as  a  baseline  for  controlling  flocks  or  swarms  of  vehicles  and  have 
varied  the  rules  in  order  to  solve  assorted  problems.  Lambach  [61]  modifies  Reynolds’ 
rules  in  order  to  determine  a  flight  configuration  that  would  reduce  drag  and  fuel 
consumption,  which  leads  to  increased  system  range  and  endurance.  Kaiser  [62] 
performs  ISR  on  a  point  target  which  added  nine  additional  rules  to  the  three  basic 
Reynolds’  rules. 

Rabbath  [63]  offers  a  review  of  several  papers  covering  the  topic,  including  Chan¬ 
dler  [64]  who  rendezvoused  three  UAVs  from  physically  separate  locations  to  a  final 
location  at  the  same  time.  Each  vehicle  plans  its  own  path  with  an  approximate  op¬ 
timal  path  that  is  refined  according  to  maneuver  constraints.  A  sensitivity  function, 
communicated  from  each  UAV,  is  calculated  by  a  coordinating  agent.  The  coordinat¬ 
ing  agent  broadcasts  the  optimal  time  of  arrival  for  all  UAVs.  The  UAVs  then  adjust 
their  own  path  according  to  the  broadcast  time  [65]. 

Shima  [66]  establishes  a  scenario  in  which  a  group  of  micro-UAVs  are  released  from 
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a  small  UAV  in  order  to  investigate  targets  in  an  urban  environment.  The  rnicro- 
UAVs  know  the  targets,  but  upon  release  have  no  coordination  or  collaboration  on 
how  to  attain  this  mission.  Multi-vehicle  collaborative  control  gathers  information 
on  the  targets  and  available  micro- vehicles  and  assigns  a  target  to  each  micro-UAV 
according  to  a  computed  optimal  path  for  each  vehicle  to  accomplish  the  overall 
mission. 

Shima  [66]  and  Chandler  [64]  utilize  a  centralized  planner  to  gather  information 
and  send  out  tasks  to  the  individual  UAVs.  Ryan  [67]  utilizes  a  decentralized  structure 
to  control  a  group  of  UAVs  to  patrol  an  area  searching  for  intruders  while  avoiding 
constraints  in  the  grid  space.  These  vehicles  collaborate  in  real-time  based  on  shared 
information  and  their  own  local  information. 

Alighanbari  [68]  uses  Robust  Decentralized  Task  Assignment  (RDTA)  [69]  in 
which  a  solution  is  determined  in  two  phases.  In  phase  1,  each  vehicle  computes 
a  set  of  candidate  scenarios  for  itself  and  sends  the  information  out.  This,  in  essence, 
is  an  initial  guess  for  the  next  phase.  In  phase  2,  each  vehicle  computes  its  own 
scenario  based  on  the  set  of  candidate  scenarios.  This  cuts  down  the  computation 
time  and  high  volume  of  communication  data  significantly.  This  method  is  reliant  on 
a  good  set  of  candidate  scenarios  computed  in  the  first  phase. 

Garcia  [29]  uses  artificial  potential  functions  to  control  a  splinter  activity  in  which 
three  UAVs  are  commanded  to  ‘hover’  around  a  hot-spot  until  their  manned  lead  is 
deemed  a  safe  distance  from  the  hot-spot,  then  rejoin  the  formation. 

The  next  set  of  techniques  are  gradient-based  numerical  methods  for  solving  opti¬ 
mal  control  problems  (discussed  further  in  Section  2.3).  Furukawa  [70]  uses  indirect 
methods  to  minimize  time  to  a  final  formation.  In  the  indirect  method,  the  cost, 
constraints  and  boundary  conditions  are  established.  The  cost  function  is  then  aug¬ 
mented  with  the  constraints  and  boundary  conditions  using  Lagrange  multipliers. 
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Using  the  Calculus  of  Variations,  a  set  of  first-order  necessary  conditions  for  optimal¬ 
ity  are  established  [71] .  These  first-order  necessary  conditions  can  be  discretized  and 
solved  using  a  Nonlinear  Program  (NLP)  solver  such  as  MATLABs  fmincon  in  order 
to  obtain  costates,  optimal  control  and  the  associated  optimal  states. 

Jorris  [72],  Masternak  [73],  and  Smith  [8]  used  direct  orthogonal  collocation  as 
a  method  for  transcribing  an  optimal  control  problem  into  an  NLP.  This  method 
approximates  the  states  and  controls  with  interpolating  polynomials,  where  in  order 
to  minimize  the  interpolation  error  between  the  points,  the  interpolation  conditions 
are  enforced  at  the  roots  or  extrema  of  orthogonal  polynomials.  These  approximations 
are  substituted  into  the  optimal  control  problem  directly,  rather  than  the  necessary 
conditions  for  optimality,  resulting  in  an  NLP.  Using  commercially  available  solvers 
such  as  SNOPT  [74]  or  IP  OPT  [75],  the  NLP  is  solved  using  a  gradient-based  search 
method  like  sequential  quadratic  programming  (SQP)  or  by  using  an  interior  point 
method.  More  on  the  use  of  direct  orthogonal  collocation  techniques  for  solving 
optimal  control  problems  can  be  found  in  Benson  [76]  and  Huntington  [77]. 

Jorris’  [72]  candidate  scenario  for  the  global  strike  problem  was  similar  to  the  loyal 
wingman  scenario  in  which  a  UAV  begins  with  a  fixed  set  of  boundary  conditions  and 
traverses  intermediate  waypoints  on  route  to  a  final  destination.  No-fly  zones  or  path 
constraints  were  established  which  the  vehicle  must  avoid  and  multiple  cost  functions 
were  evaluated,  including  minimize  time  and  minimize  control.  Masternak  [73]  later 
solved  a  modified  version  of  the  global  strike  scenario  including  multiple  objectives, 
no-fly  or  keep  out  zones,  and  he  proposed  direct  collocation  as  a  method  for  solving 
the  optimal  control  problem. 

Smith’s  [8]  scenario  proposed  a  UAV,  operating  in  the  National  Airspace,  flying  a 
precomputed  mission  path,  when  sensors  detect  an  intruder  aircraft.  The  UAV  must 
use  the  sensor  information  to  estimate  the  intruder’s  future  state,  model  the  future 
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state  as  a  keep-out  zone  and  successfully  avoid  the  aircraft  while  returning  to  the 
original  flight  path,  optimized  with  respect  to  various  objective  functions.  Smith’s 
scenario  must  account  for  dynamic,  non-deterministic  intruders  that  “pop-up”  during 
execution  of  the  precomputed  mission  plan,  similar  to  the  loyal  wingman  candidate 
scenario  as  defined  herein. 

2.2.5  Research  Approach. 

The  previous  four  subsections  established  multiple  methods  for  controlling  UAVs 
and  groups  of  UAVs  throughout  the  four  attributes  of  formation  flight.  The  loyal 
wingman  seeks  an  outer-loop  control  technique  to  compute  an  initial  optimal  mission 
path  as  well  as  computing  a  trajectory  re-plan  due  to  changes  in  the  mission  environ¬ 
ment.  Therefore,  outer-loop  control  techniques  which  continually  define  a  path  based 
on  high  frequency  tracking  updates  will  not  be  utilized. 

In  distinguishing  between  direct  and  indirect  methods  for  solving  the  optimal  con¬ 
trol  problem,  the  use  of  direct  orthogonal  collocation  in  combination  with  the  GPOPS 
II  [78]  MATLAB®  [79]  based  software  allows  for  rapid  solutions  to  optimal  control 
problems,  which  addresses  the  near  real-time  optimal  control  that  is  necessary  in  the 
loyal  wingman  candidate  scenario.  The  similarities  of  the  loyal  wingman  scenario  to 
works  discussed  in  Section  2.2.4  reveal  the  usefulness  of  this  method.  Therefore,  this 
work  proposes  the  use  of  the  rapid  and  accurate  direct  orthogonal  collocation  method 
to  solve  the  loyal  wingman  optimal  control  problem  as  defined  herein  as  a  unique  and 
original  contribution  to  the  scientific  and  engineering  body  of  knowledge. 

2.3  Nonconvex  Optimal  Control 

Kirk  [71]  provides  a  definition  of  a  convex  set  and  a  method  to  check  the  convexity 
of  a  function.  If  the  function  is  convex,  then  a  solution  to  the  optimal  control  problem 
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is  guaranteed  to  be  the  globally  optimal  solution.  Unfortunately,  many  optimal  con¬ 
trol  problems  of  interest  are  nonlinear  and  the  solution  therefore  is  considered  only 
locally  optimal. 

Highly  nonlinear,  nonconvex  problems  have  a  large  number  of  locally  optimal 
solutions.  The  challenge  is  to  use  a  solution  method  that  is  most  likely  to  converge  on 
the  best  locally  optimal  solution.  The  direct  orthogonal  collocation  method  chosen  to 
solve  the  loyal  wingman  problem  uses  a  gradient-based  numerical  method  for  solving 
the  NLP,  which  requires  a  user  supplied  initial  guess.  Jodeh  [80]  used  a  straight  line 
trajectory  as  an  initial  guess  with  only  a  fraction  of  the  desired  discretized  points, 
then  used  the  low-fidelity  solution  as  an  initial  guess  into  a  higher-fidelity  model. 
Smith  [8]  and  Suplisson  [81]  have  a  continually  updating  solution  where  previous 
solutions  are  used  as  an  initial  guess  into  the  new  update,  but  still  required  a  method 
to  determine  the  first  solution. 

There  is  a  tendency  for  NLP  optimizers  to  converge  in  the  region  of  the  initial 
guess  [82],  therefore,  a  method  is  desired  for  supplying  an  initial  guess  for  the  loyal 
wingman  solution  that  is  in  the  region  of  the  best  locally  optimal  solution. 

Rao  [83]  broadly  categorized  numerical  methods  for  solving  optimal  control  prob¬ 
lems  into  gradient  and  heuristic  methods.  Gradient  methods  utilize  derivative  in¬ 
formation  provided  in  the  problem  formulation  to  search  deterministically  for  the 
optimal  solution.  Gradient  methods  are  local  optimization  methods,  meaning  the 
results  converge  to  locally  optimal  solutions.  Conversely,  heuristic  methods  begin 
with  a  set  of  possible  solutions  and  use  a  stochastic  search  to  continually  update  and 
adapt  the  initial  set  of  solutions  until  an  optimal  solution  is  found.  Contrary  to  the 
gradient-based  methods  which  converge  to  locally  optimal  solutions,  heuristic  meth¬ 
ods  are  a  global  technique.  Throughout  the  work  herein  when  using  the  phrase  global 
region  or  global  nature  of  the  PSO,  this  refers  to  the  good  possibility  that  the  PSO 


provides  to  move  toward  the  best  locally  optimal  solution  within  the  space  that  is 
searched,  without  guarantee  that  the  PSO  will  converge.  Examples  of  heuristic  meth¬ 
ods  include  genetic  algorithms,  particle  swarm  optimization,  ant  colony  optimization 
[84],  differential  evolution  [85]  and  simulated  annealing  [86]. 

Conway’s  [87]  survey  suggested  the  best  method  for  solving  optimal  control  prob¬ 
lems  was  either  heuristic  algorithms  alone  or  heuristic  algorithms  in  combination  with 
transcription  techniques  which  utilize  the  gradient-based  NLP.  Englander  [88]  and 
Chilan  [89]  used  a  heuristic  technique  to  generate  the  outer-loop  solution.  Englander 
continued  to  use  a  heuristic  technique  to  generate  the  inner-loop  solution,  while  Chi¬ 
lan  used  a  gradient-based  technique  for  the  inner-loop  solution.  Showalter  [90]  and 
Vinko  [91]  used  hybrid  heuristic  optimal  control  techniques  for  space-based  applica¬ 
tions.  Modares  [92]  performed  an  experiment  comparing  various  hybrid  techniques 
and  concluded  that  a  modified  heuristic  particle  swarm  optimization  combined  with  a 
gradient-based  sequential  quadratic  program  is  robust  and  accurate  when  compared 
with  other  methods.  Additional  works  [82,  93,  94,  95,  96]  have  used  a  hybrid  PSO 
with  gradient-based  NLP. 

The  work  herein  applies  a  hybrid  optimization  technique  that  exploits  the  speed 
and  global  nature  of  a  heuristic  PSO  to  generate  an  initial  guess  to  a  gradient-based 
NLP,  which  exploits  the  speed  and  accuracy  of  direct  orthogonal  collocation  (DOC) 
to  provide  a  rapid,  feasible  solution  to  the  loyal  wingman  intermediate-target  optimal 
control  problem. 

2.4  The  Particle  Swarm  Optimization 

2.4.1  Particle  Swarm  Optimization  Background. 

The  Particle  Swarm  Optimization  (PSO)  was  introduced  by  Kennedy  and  Eber- 
hart  [97] .  Based  on  the  behavior  of  flocks  (or  swarms)  of  birds,  each  particle,  which 
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is  in  general  any  scalar  or  vector  quantity  the  user  desires,  is  flown  through  space 
in  search  of  the  optimal  solution  with  respect  to  a  defined  fitness  or  cost  function. 
The  basic  PSO  algorithm  is  two  lines  of  code  which  first  updates  its  velocity  (the 
rate  at  which  a  solution  changes)  based  on  its  previous  velocity  (often  termed  the 
‘inertial’  component)  and  the  behavior  of  the  flock  (or  swarm);  and  then  uses  the 
updated  velocity  to  update  the  particle’s  position2.  The  algorithm  is  seeded  with  an 
initial  particle  position  and  velocity  and  iterates  through  the  following  two  formulas 
to  update  the  particle’s  velocity,  v(i),  and  position,  x(i), 

v(i  +  1)  =  aw{i  +  1)  +  hri(x(i)  -  pL)  +  b2r2(x(i )  -  pG)  (2.1a) 

x{i  +  1)  =  x{i)  +  v(i  +  1).  (2.1b) 

The  updated  velocity,  Equation  2.1a,  of  an  individual  particle  is  achieved  through  a 
three-part  formula.  The  inertial  component  provides  the  weighting,  U\  for  the  parti¬ 
cle’s  previously  computed  velocity.  The  first  ‘social’  component  provides  weighting, 
bi,  on  the  best  solution  found  so  far  for  each  individual  particle.  On  any  given  iter¬ 
ation,  if  the  cost  with  v(i)  and  x(i)  is  better  than  that  for  pL,  then  that  particle’s 
‘local  best’  is  updated.  The  second  social  component,  b2,  provides  a  weighting  on  the 
best  solution  found  for  the  entire  ‘swarm’  (all  particles).  On  any  given  iteration,  if  a 
better  solution  is  found  for  the  swarm,  then  the  global  best  is  updated.  Each  of  the 
social  components  have  a  deterministic  parameter  weighting  {b\  and  b2)  as  well  as  a 
nondeterministic  evenly  distributed  parameter  weighting,  r\)r2  G  [0, 1]. 

The  basic  PSO  algorithm  is  only  two  lines  of  code,  but  a  review  of  literature  reveals 
there  are  many  methods  for  varying  the  PSO  algorithm  for  individual  applications. 

2‘position’  and  ‘velocity’  when  referenced  to  discussion  on  PSO  refer  to  the  current  value  of  the 
particle’s  position  and  the  rate  the  the  particle’s  position  is  changing 
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2.4.2  PSO  Parameters. 


Convergence  characteristics  of  the  PSO  algorithm  are  dependent  both  on  the 
application  as  well  as  the  choice  of  parameters.  Based  on  a  mathematically  complex 
work  done  by  Clerc  and  Kennedy  [98] ,  Trclea  performed  an  experiment  revealing  these 
convergence  tendencies  [99].  Clerc  [100]  suggested  scaling  the  update  to  velocity  by  a 
constriction  factor,  K,  as  necessary  to  ensure  convergence.  The  use  of  a  constriction 
factor  weights  the  inertial  component,  eliminating  the  need  to  identify  a  weighting  (oi) 
for  the  inertial  component.  Equations  2.1a  and  2.1b  are  updated  with  a  constriction 
in  Equation  2.2. 


v(i  +  1)  =  K[v(i)  +  6^1  (x(i)  -  pi)  +  b2r2(x(i )  -  pg)\  (2.2) 


The  formula  for  the  constriction  factor  can  be  mathematically  complex  such  as  the 
one  proposed  by  Lu  [101]  which  allows  trigonometric  variation  of  the  constriction 
factor  such  that  a  higher  constriction  during  early  iterations  allows  for  a  more  global 
search,  while  a  smaller  constriction  value  in  later  iterations  focused  search  in  the  local 
regions.  Clerc  [100]  recommended  a  simplified  version  and  will  be  used  in  the  loyal 
wingman  application, 


K  = 


(2,3) 


4>  -  vVJ  -  4<A’ 

with  (j)  =  bi+b2l  (j)  >  4.  Eberhart  and  Shi  [102]  determined  that  the  constriction  factor 
should  be  used  in  combination  with  a  max  value  assigned  to  velocity,  vmax  =  xmax, 
so  v  is  bound  by  xmax  at  each  iteration. 


2.4.3  PSO  Seeds. 

All  works  studied  used  random  methods  for  producing  the  initial  particles  [54,  57, 
93,  94,  103,  104,  105,  106,  107,  108]  to  seed  the  algorithm. 
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2.4.4  PSO  Constraints. 


Hu  and  Eberhart  [109]  suggest  initializing  the  algorithm  with  a  set  of  seeds  that 
meet  constraints,  then  at  each  iteration  only  keeping  particles  that  continue  to  meet 
constraints.  Pontani  [105]  suggests  that  any  particle  that  does  not  meet  constraints 
at  each  iteration  should  have  its  velocity  component  set  to  0,  essentially  throwing 
out  that  iteration.  Mazhoud  [107]  provides  a  review  of  methods  for  handling  con¬ 
straints,  including  penalty  functions  [110],  self-adaptive  velocity  [111]  and  stagnation 
determination  [112], 

2.4.5  PSO  Model  Order. 

Duan  [54,  106,  57]  used  the  PSO  to  choose  the  elements  of  a  vector  that  provide 
the  optimal  solution  with  respect  to  the  established  cost  function.  Others,  such 
as  Conway  suggest  computation  time  efficiency  by  modeling  the  control  input  as  a 
polynomial  and  using  the  PSO  to  solve  for  the  polynomial  coefficients.  Zhuang  [93] 
shows  that  his  problem  has  a  property  of  differential  flatness  which  allows  a  reduction 
in  the  number  of  variables. 

2.4.6  PSO  Conclusion. 

The  work  herein  will  use  these  ideas  to  tailor  a  PSO  algorithm  to  the  loyal  wingman 
scenario. 

2.5  Models 

The  literature  review  to  this  point  has  focused  on  determining  a  method  for  solving 
the  optimal  control  problem.  The  purpose  of  this  section  is  to  present  various  aspects 
of  the  problem  that  must  be  modeled,  covering  threats,  UAV  dynamics,  and  multiple- 
target  (waypoint)  missions. 
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2.5.1  Loyal  Wingman. 


A  state  model  must  be  determined  for  the  loyal  wingman  that  will  be  used  as 
dynamic  constraints  in  the  optimal  control  problem.  Carter  [7]  chose  a  high-fidelity 
12-state,  6  Degree-of-Freedom  (DOF)  model  because  of  the  highly  dynamic  and  crit¬ 
ical  maneuvering  capabilities  necessary  in  high  performance  fighter  air  vehicles.  This 
high-fidelity  model  is  computationally  expensive,  but  is  critical  to  obtaining  accurate 
results.  If  the  fidelity  were  not  necessary,  then  it  would  be  resourceful  to  utilize  a 
reduced-order  model  to  increase  computational  efficiency.  Smith  chose  a  5-sate,  3-D, 
2-control  model  consistent  with  the  AFRL  program  document  [113]  where  the  con¬ 
trols  are  vertical  acceleration,  Nz,  and  bank  angle,  ji.  The  work  herein  uses  the  same 
model,  but  in  order  to  avoid  ‘bang-bang’  control,  Smith’s  controls  are  made  states 
and  their  rates  are  made  the  controls, 


x(t) 

y(t) 

z(t) 

7(f) 

x(t) 

Nz(t ) 

Mt) 


V  cos  7 (t)  cos  x(t) 

V  cos  7 (t)  siny(f) 

V  sin  7(f) 

Nzg  cos  nit)  —  g  cos  7 (t) 
V 

Nzg  sin  /x(t) 

V  cos  7(f) 

Ul(t) 

U2(t), 


(2.4) 


where  x,  y,  z  represent  position  coordinates,  7  is  flight  path  angle,  y  is  heading,  Nz  is 
vertical  acceleration,  //  is  bank  angle,  controls  are  vertical  acceleration  rate,  u\  =  Nz 
and  bank  angle  rate,  u2  =  fi.  Velocity,  V  is  assumed  constant  and  g  is  the  gravitational 
constant. 
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2.5.2  Threats. 


The  loyal  wingman  will  operate  in  a  hostile  environment  where,  if  the  loyal  wing- 
man  is  detected,  the  chances  are  high  that  the  mission  will  not  be  completed.  This 
section  reviews  how  other  literary  works  model  threats. 

2. 5. 2.1  Static  Threat  Models. 

Garcia  [29]  uses  artificial  potential  functions  as  a  means  of  controlling  the  UAVs. 
The  obstacles  were  similarly  modeled  using  vector  fields  with  weighted  sigmoid  func¬ 
tions,  which  works  well  for  static  and  predictably  slow-moving  obstacles.  Jorris  mod¬ 
els  two  no-fly  zones  as  cylinders  that  extended  infinitely  high  [72],  Weihua  [45,  114] 
uses  simple  circles  to  model  small  obstacles,  but  if  the  obstacle  is  determined  large, 
then  Weihua  uses  polytopes  [115]  to  model  the  obstacles,  because  of  the  variety  of 
shapes  that  can  be  generated.  Given  a  chosen  shape,  a  hexagon  for  example,  each 
of  the  6  edges  are  assigned  a  1  for  its  safe  side  and  a  0  for  its  unsafe  side.  Each 
discretized  point  of  the  full  trajectory  is  evaluated  against  each  edge  line  of  the  shape 
to  determine  whether  the  trajectory  is  on  the  “safe”  or  “unsafe”  side  of  the  line. 
For  an  individual  discretized  point  of  the  trajectory,  its  binary  “safe”  and  “unsafe” 
values  are  summed  and  if  the  value  is  greater  than  or  equal  to  1,  then  the  threat  is 
successfully  avoided  at  that  point  in  the  trajectory. 

2. 5. 2. 2  Modeling  a  Changing  Threat  Region. 

Campbell  [116]  models  dynamic,  non-deterministic  threats  by  first  modeling  a 
constraint  as  an  ellipsoid  then  adapting  the  ellipsoid  by  developing  an  extended  set 
membership  filter.  This  filter  linearizes  the  nonlinear  model  at  each  discrete  time 
step,  then  adds  the  remainder  as  uncertainty,  delivering  an  ellipsoid  set.  Campbell 
further  tightens  the  estimate  and  approximates  the  intersection  of  the  two  ellipsoid 
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estimates  as  a  polyhedron.  A  final  ellipsoid  bounds  the  polyhedron  and  is  used  as  the 
obstacle  that  must  be  avoided. 

Smith  [8]  uses  a  particle  filter  to  estimate  future  location  of  a  dynamic  intruder  ve¬ 
hicle  for  avoidance.  The  obstacle’s  particle  distribution  is  modeled  using  a  Minimum 
Volume  Enclosing  Ellipsoid  (MVEE)  [117,  118]  and  projected  along  a  trajectory.  The 
set  of  ellipsoids  created  by  the  particle  filter  and  MVEE  algorithm  are  then  interpo¬ 
lated  into  a  single  keep-out  region  using  the  SLIMVEE  [8]  algorithm,  adapted  from 
Schoemake’s  Slerp  [119]  algorithm. 

Smith  additionally  used  superquadrics  as  described  by  Barr  [14].  The  product  of 
a  half  ellipsoid  in  the  (x,  y )  plane  with  the  full  ellipsoid  in  the  plane  orthogonal  to 
the  (x,  y )  plane  can  be  written  as 


where 


aicos£l(77)  cos£2(a;) 
a2  cos£1  (77)  sin£2  (co)  , 
a3  sin£l  (77) 


(2.5) 


—  7t/2  <  77  <  7t/2 
—7 r  <  co  <  7 r. 


An  implicit  equation  can  be  derived  from  the  above  Equation  2.5 


(2.6a) 

(2.6b) 


F(x,  y,  z,  oq,  a2,  a3,  £1}  e2)  = 


(2.7) 


where  at  is  a  principal  axis,  £1  determines  the  shape  of  the  superellipsoid  cross  sec¬ 
tion  in  a  plane  perpendicular  to  the  (x,  y)  plane  and  e2  determines  the  shape  of  the 
superellipsoid  cross  section  parallel  to  the  (x,y)  plane.  Equation  2.7  is  the  general 
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formula  for  modeling  a  class  of  superquadrics,  called  the  inside-outside  function.  For 
a  given  point  (. x,y,z ),  if  F  <  1,  the  point  is  in  the  interior  of  the  geometric  shape, 
if  F  >  1,  the  point  is  exterior  to  the  geometric  shape  and  if  F  —  1,  the  point  is 
on  the  surface  of  the  geometric  shape.  By  varying  the  principal  axes  values  a*  as 
well  as  the  et  values,  various  superquadric  shapes  may  be  generated  such  as  spheres, 
supercllipsoids,  and  cylinders  [120].  Smith  varied  the  equation  slightly  to  use  only  a 
single  shaping  parameter  to  model  spheres,  ellipsoids  and  cylindrical  superquadrics, 


-^(x,  Vi  ^ :  ^1?  @ 2 j  ^3? 


(2.8) 


2. 5. 2. 3  Minimize  Threat  Exposure. 

The  desire  is  not  always  a  strict  keep-out  region,  but  rather  to  minimize  expo¬ 
sure.  Gaunt  [121]  calculated  the  Signal  to  Noise  Ratio  (SNR)  and  then  developed 
a  formulation  that  can  be  used  inside  a  cost  functional  to  minimize  exposure.  His 
formulation  forces  the  cost  functional  to  equal  1  when  the  aircraft  is  at  the  radar 
detection  limit  and  then  grow  quickly  as  the  vehicle  traverses  deeper  into  the  threat 
area.  This  method  allows  the  threat  areas  to  be  modeled  in  the  cost  function  as  a 
continuous,  differentiable,  conditional  constraint.  Jodeh  [80]  uses  a  similar  technique, 
modeling  a  conditional  constraint  in  the  cost  function  with  the  use  of  a  sigmoid  or 
barrier  function.  Smith  [8]  used  a  sigmoid  function  as  well  to  establish  an  inequal¬ 
ity  constraint  in  his  Sense  and  Avoid  (SAA)  optimal  control  problem.  The  keep-out 
zone  for  two  aircraft  approaching  one  another  is  the  Federal  Aviation  Administra¬ 
tion  (FAA)  required  minimum  vertical  and  horizontal  separation  distances  specified 
for  the  National  Air  Space  (NAS).  The  need  to  avoid  both  a  vertical  and  horizontal 
separation  distance  translates  to  the  need  to  avoid  multiple  threats.  Smith  evaluated 
the  difference  between  summing  the  sigmoid  of  multiple  threat  zones  and  taking  the 
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product  of  the  sigmoid  of  multiple  threat  zones  and  determines  the  sigmoid  product 
method  as  superior  [8]. 


2. 5. 2. 4  Dynamic,  Non-Deterministic  Threat  Models. 

Threats  may  include  air  or  ground  threats.  Li  and  Jilkov  [122]  survey  various  dy¬ 
namics  by  which  to  model  maneuvering  intruder  aircraft.  By  determining  the  appro¬ 
priate  assumptions,  such  as  whether  the  vehicle  is  maneuvering  or  non-maneuvering, 
aggressive  or  non-aggressive,  a  2-D  or  a  3-D  dynamics  model,  and  the  way  noise 
is  modeled  in  the  system,  there  are  various  dynamic  models  from  which  to  choose. 
Smith  [8]  chose  a  Singer  [123]  acceleration  model,  which  assumes  non-aggressive,  un¬ 
coordinated,  maneuvering  aircraft  with  time  auto-correlation,  a. 


x  = 


0  1  0 

0  0  1 

0  0  —a 


(2.9) 


Stankovic  [11]  develops  a  single  axis  model  to  control  a  platoon  of  vehicles.  Assump¬ 
tions  in  the  loyal  wingman  problem  will  result  in  a  simple  linear  model  that  combines 
these  works. 


2. 5. 2. 5  Sensor  Measurement  Updates. 

Maroney  [124]  investigates  the  concept  that  there  may  be  a  variety  of  sensor 
combinations  that  are  appropriate  for  various  UAVs  according  to  their  capability. 
Chen  [125]  integrates  the  most  appropriate  set  of  sensors  for  the  sense  and  avoid 
mission  application.  The  work  herein  assumes  that  the  appropriate  sensor  and  mea¬ 
surement  integration  techniques  are  available  for  the  loyal  wingman  to  appropriately 
track  and  avoid  threats  in  the  air  and  on  the  ground. 
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2. 5. 2. 6  Possible  Threats. 


The  research  sponsor  provided  a  list  of  threats  that  may  be  applicable  to  the  loyal 
wingman  problem.  Table  2. 5. 2. 6  lists  threats  and  information  necessary  to  model 
them  as  superquadrics.  In  addition,  the  3-D  contour  of  the  superquadric  threat  keep- 
out  regions  are  represented  in  Figure  2.1.  These  images  are  provided  here  because 
graphical  results  of  threats  modeled  in  3-D  in  subsequent  chapters  are  displayed  on 
a  2-D  image  where  it  may  be  difficult  to  see  a  loyal  wingman  trajectory  that  avoids 
the  threat  using  the  altitude  dimension. 


Table  2.1.  Possible  Loyal  Wingman  Threats 


Type 

Range  (km) 

Altitude  (km) 

3D  Shape 

Surface  to  Air  Missile  (SAM) 

25 

25 

Sphere 

Anti-Aircraft  Artillery  (AAA) 

18 

15 

Ellipsoid 

Networked  Radar 

180 

15 

Ellipsoid 

Manpad 

6 

3 

Ellipsoid 

Electro-Optical  Sensor 

25 

15 

Ellipsoid 

Weather 

20 

15 

Cylinder 

Threat  on  Road 

100x20 

20 

Cylinder 

2.5.3  Multiple- Target  Mission. 

The  loyal  wingman  candidate  scenario  identifies  the  requirement  to  rendezvous 
at  multiple  mission  targets  on  route  to  a  fixed  final  time  and  location.  Beard  [126] 
models  threats  as  Voronoi  polygons  which  must  be  avoided,  then  identifies  waypoints 
at  the  corners  of  the  polygons.  The  path  planner  then  traverses  the  edges  of  the 
polygon  from  one  waypoint  to  another.  Lee  [127]  similarly  has  a  UAV  track  a  ground 
vehicle  using  waypoint  path  planning  by  establishing  different  algorithms  based  on 
the  relative  speed  of  the  UAV  and  ground  vehicle. 

Benson  specifies  that  there  are  a  number  of  conditions  for  which  the  direct  or¬ 
thogonal  collocation  method  is  best  solved  using  multiple  phases,  one  of  which  is  an 
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Altitude  (km)  M  Altitude  (km) 


(b)  Estimated  Threat  Region  Extended  Along  Road 


Figure  2.1.  3-D  Threat  Keep-Out  Regions  Using  Superquadrics 
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interior  point  constraint  such  as  a  waypoint  [128].  Jorris  [72],  using  direct  orthogonal 
collocation,  showed  the  need  to  put  a  phase  break  at  the  constraint. 

2.6  Stochastic  Estimation 

The  current  chapter  proposed  both  optimal  control  and  stochastic  estimation  tech¬ 
niques  to  solve  the  loyal  wingman  problem  as  defined  herein.  This  section  examines 
techniques  from  the  literature  that  may  be  used  to  estimate  moving,  non-detcrministic 
threats.  The  choice  of  estimation  tool  will  be  dependent  on  assumptions  and  user 
desires,  including  linearity  of  the  model,  estimation  probability  distribution,  and  com¬ 
putational  efficiency. 

2.6.1  Kalman  Filter. 

The  Kalman  filter  was  first  introduced  in  a  paper  by  R.E.  Kalman  in  1960  [129]. 
Around  the  same  time  of  the  introduction  of  this  discrete-data  linear  filter  technique, 
advances  in  digital  computer  technology  made  it  possible  to  implement  real-time 
recursive  estimating  solutions.  Brown  and  Hwang  provide  an  example  Kalman  filter 
algorithm  [130].  The  key  attributes  of  the  Kalman  filter  are  the  use  of  a  linear  model 
for  propagating  states  and  the  assumption  that  the  probability  distribution  of  the 
estimate  is  Gaussian.  Additionally,  the  simplicity  of  the  Kalman  filter  means  the 
computational  complexity  is  low  when  compared  to  particle  based  methods  such  as 
the  unscented  Kalman  filter  and  the  particle  filter. 

Shima  and  Rasmussen  [131]  demonstrate  a  control  technique  for  multiple  vehicle 
collaborative  control  using  the  Kalman  filter.  In  their  paper,  each  UAV  runs  multiple 
filters  in  parallel  estimating  its  own  states  as  well  as  running  filters  on  its  own  states 
as  viewed  by  other  members  of  the  group.  Singer  [123]  demonstrates  the  use  of  the 
Kalman  filter  for  the  specific  purpose  of  tracking  maneuvering  targets.  This  method 
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allows  rapid,  a  priori  estimates  of  tracking  performance  by  sensors  providing  any 
combination  of  range,  bearing,  and  elevation  measurements. 

2.6.2  Extended  Kalman  Filter. 

The  Extended  Kalman  Filter  (EKF)  is  a  variation  of  the  Kalman  filter  that  allows 
for  recursive  estimation  of  measurement  and  state  propagation  functions  that  are 
nonlinear.  The  EKF  linearizes  the  measurement  and  state  propagation  functions 
about  the  estimated  trajectory  by  taking  the  partial  derivatives  of  the  measurement 
and  state  propagation  functions,  which  can  then  be  used  in  the  recursive  Kalman 
filter  equations.  Although  this  may  seem  convenient,  there  are  risks.  First,  large 
values  in  the  original  covariance  matrix  combined  with  low-noise  measurements  at 
the  first  step  will  cause  the  covariance  matrix  to  jump  from  the  original  large  value 
to  a  very  small  value.  Roundoff  errors  associated  with  computation  of  the  covariance 
matrix  can  cause  numerical  computing  issues  that  must  be  dealt  with  appropriately. 
A  potential  solution  is  to  ensure  symmetry  and  positive  definiteness  of  the  covariance 
matrix  on  the  first  step  [130]. 

The  second  issue  is  the  error  associated  with  the  linearization  of  a  nonlinear 
function,  which  takes  the  first  two  terms  of  the  Taylor  series  expansion.  In  some 
cases,  the  linearized  matrix  is  not  an  accurate  representation  of  the  original  function, 
which  means  there  is  an  error  from  the  beginning  that  is  then  propagated  indefinitely 
through  the  model,  causing  the  error  to  continually  grow  and  in  some  cases  causes 
divergence.  In  choosing  the  EKF  as  an  estimating  tool,  care  must  be  taken  to  en¬ 
sure  these  potential  problems  don’t  cause  inaccurate  estimates  for  the  problem  one 
is  trying  to  solve  [130].  Smith  [8]  compared  the  various  filters  for  a  collision  avoid¬ 
ance  estimation  problem.  His  results  showed  an  example  where  reduced  observability 
magnified  the  linearization  error  of  the  EKF. 
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2.6.3  Unscented  Kalman  Filter. 


Wan  [132]  introduces  the  Unscented  Kalman  Filter  (UKF)  as  an  alternative  to  the 
EKF  which  addresses  the  suboptimal  performance  associated  with  the  linearization 
of  the  state  propagation  and  measurement  update  functions.  The  UKF  is  accurate  to 
the  third  order  for  Gaussian  inputs  for  all  nonlinearities  and  accurate  to  the  second 
order  for  non-Gaussian  inputs.  Additionally,  the  UKF  computational  complexity  is 
of  the  same  order  as  the  EKF.  For  this  reason,  Wan  [132]  describes  the  UKF  as 
clearly  superior  to  the  EKF.  Ross  [133]  simultaneously  solved  the  optimal  control 
problem  and  the  optimal  estimation  problem  for  a  bearing-only  sensor  using  the 
UKF.  Smith  [8]  also  performed  a  comparison  of  various  models  and  showed  results 
where  a  UKF  clearly  outperformed  the  EKF.  For  instances  where  the  state  model  is 
nonlinear  and  posterior  distribution  estimates  can  be  assumed  a  Gaussian  and  Normal 
distribution,  it  is  clear  the  appropriate  estimation  technique  is  the  UKF. 

2.6.4  Particle  Filter. 

The  UKF  and  EKF  were  presented  as  options  for  estimation  when  measurement  or 
state  propagation  functions  are  nonlinear.  These  techniques  still  assume  a  Gaussian 
posterior  probability  distribution.  In  cases  where  the  posterior  probability  distri¬ 
bution  cannot  be  assumed  Gaussian,  another  estimation  technique  is  needed.  The 
particle  filter  operates  by  first  creating  a  set  of  particles,  y.  These  particles  are  then 
propagated  through  the  state  transition  function.  Because  there  is  a  non-deterministic 
input  value  associated  with  the  state  transition  matrix,  the  particles  that  propagate 
through  the  state  transition  matrix  all  have  varying  values.  This  propagation  contin¬ 
ues  until  there  is  a  measurement  update.  Each  of  the  particles  are  then  assigned  a 
weighting  based  on  a  likelihood  function.  This  function  allows  each  particle  to  eval¬ 
uate  its  likelihood  of  being  true  based  on  the  value  obtained  from  the  measurement. 
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The  probability  distribution  function  is  then  found  by  normalizing  the  weights  of  the 
particles.  An  algorithm  for  the  particle  filter  provided  by  Kim  [134]  and  examples 
of  variations  of  the  particle  filter  and  methods  and  algorithms  for  re-sampling  are 
provided  by  Arulampalam  [135]. 

The  benefit  of  particle  filters  is  they  are  robust  to  handle  nonlinear  models  and 
there  need  not  be  Gaussian  assumptions  on  the  posterior  probability  distribution. 
However,  the  expense  is  typically  a  high  computation  cost  based  on  the  number  of 
particles  that  are  used  in  the  filter.  There  are  a  number  of  examples  of  use  of  particle 
filters  for  the  control  of  UAVs  in  literature.  Conde  [136]  uses  a  particle  filter  to  predict 
the  trajectory  of  a  UAV  and  in  particular  highlights  the  importance  of  re-sampling 
in  the  results  of  the  work.  Gustafsson  [137]  demonstrates  a  technique  known  as  Rao- 
Blackwcllization  by  using  a  combination  of  a  particle  filter  and  a  Kalman  filter  for 
estimation.  He  showed  that  for  a  high-dimensioned,  27-state  problem,  24  of  the  states 
were  linear  and  could  be  estimated  using  the  Kalman  filter.  The  particle  filter  was 
then  used  to  estimate  the  three  remaining  position  states.  Combining  the  use  of  the 
filters  assured  no  loss  of  fidelity  while  significantly  reducing  the  computational  cost. 

2.7  Trajectory  Re-Planning 

The  final  aspect  of  the  loyal  wingman  optimal  control  problem  requires  a  re¬ 
view  of  techniques  that  allow  for  dynamic,  near-real-time  trajectory  re-planning.  In 
Smith’s  [8]  SAA  scenario,  a  UAV  operating  in  the  NAS  is  flying  a  precomputed 
path.  A  suite  of  sensors  provide  information  on  other  potential  aircraft  that  must 
be  avoided.  A  receding  horizon  model  predictive  controller  continually  projects  an 
image  of  the  airspace  a  given  time  in  the  future,  t^.  If  necessary,  a  path  is  created 
which  optimally  avoids  intruder  aircraft  and  returns  to  the  previously  identified  path. 
The  aircraft  then  flies  the  newly  computed  path  for  a  portion  of  the  horizon  tp  and 
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sequentially  updates  the  collision  avoidance  trajectory  every  tp.  An  inner-loop  con¬ 
troller  such  as  an  LQR  is  used  to  ‘fly’  each  tp.  There  are  challenges  with  using  an 
LQR  [138]  and  so  Jiang  [139]  uses  Neighboring  Optimal  Control  (NOC)  to  obtain  a 
solution  to  return  to  a  nominal  path.  Described  further  by  Yan  [138],  NOC  establishes 
a  set  of  perturbation  equations  linearized  around  the  nominal  trajectory,  assumes  the 
solution  in  the  form  of  polynomials  discretized  at  the  roots  of  an  orthogonal  basis  set 
and  may  then  quickly  find  the  solution  to  a  set  of  algebraic  equations.  NOC  uses  the 
same  direct  orthogonal  collocation  transcription  technique  used  in  the  loyal  wingman 
optimal  control  problem  described  herein  and  therefore  may  be  a  convenient  tool  for 
dynamically  re-planning  a  mission. 

Ground-based  entities  include  input  from  manned  operators  as  well  as  the  use 
of  computer  subsystems  such  as  the  Operations  Research  Planning  and  Utility  Sys¬ 
tem  (OPUS)  [140],  described  as  an  A-star  heuristic  search  algorithm  for  very  fast 
computation  [140].  The  Mission  Reconfigurable  Cockpit  (MRC)  Real-time  In-flight 
Planner  [141]  is  an  onboard  dynamical  re-planning  tool  which  utilizes  digital  terrain 
elevation  data  (DTED)  to  create  a  small  set  of  potential  solutions  and  chooses  the 
optimal  of  the  small  set  as  its  solution.  The  benefit  of  this  method  is  it  develops 
plausible  new  trajectories  in  near  real-time. 

Both  NOC  and  the  receding  horizon  model  predictive  controller  are  a  few  of 
the  many  tools  available  for  inner-loop  trajectory  tracking  or  sequential  trajectory 
creation  as  described  in  the  literature  [142],  These  tools,  however  assume  an  outer- 
loop  solution  has  already  been  created  and  solves  the  problem  of  either  maintaining 
that  path  or  avoiding  disturbances/threats  that  may  traverse  the  path  and  require 
continual  use  of  computation  resources  that  are  assumed  not  continuously  available 
throughout  the  mission.  OPUS  and  the  MRC  planner  both  provide  full  outer- loop 
path  re-plans,  however  the  ground-based  OPUS  requires  a  human  in  the  loop  and 
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a  high  communication  bandwidth  which  is  not  suitable  for  the  low-communication 
bandwidth  requirement  of  the  loyal  wingman  optimal  control  problem,  whereas  the 
MRC  planner  is  suboptimal. 

The  work  herein  assumes  inner-loop  control  and  sequential  path-planning  tech¬ 
niques  such  as  the  ones  mentioned  previously  are  available.  The  changes  in  the  loyal 
wingman  mission  environment  are  significant  enough  to  require  a  full  mission  re-plan 
in  near  real-time.  This  research  focuses  on  the  use  of  optimal  control  techniques  to 
rapidly  create  both  an  initial  optimal  outer-loop  mission  path  as  well  as  to  re-plan 
the  path  in  a  changing  mission  environment. 

2.8  Literature  Review  Conclusion 

Chapter  II  began  with  a  review  of  literature  related  to  MUM-T  and  loyal  wing¬ 
man  applications  and  concluded  that  solving  the  loyal  wingman  problem  as  defined 
in  Chapter  I  with  optimal  control  and  stochastic  estimation  techniques  will  provide 
valuable  contributions  to  the  scientific  and  engineering  community.  A  review  was  con¬ 
ducted  across  the  four  attributes  of  formation  flight  and  determined  direct  orthogonal 
collocation  as  a  rapid  and  accurate  optimal  solution  technique.  A  discussion  then  en¬ 
sued  on  the  locally  optimal  nature  of  gradient-based  methods  and  suggested  the  use 
of  a  hybrid  optimization  technique  in  which  a  particle  swarm  optimization  algorithm 
generates  a  solution  in  the  correct  global  region  and  is  then  provided  as  an  initial 
guess  to  the  gradient-based  NLP  optimizer.  Having  determined  the  appropriate  hy¬ 
brid  optimal  control  methodology,  a  review  was  then  conducted  on  various  models 
that  are  used  throughout  literature  for  modeling  UAV  dynamics  as  well  as  various 
ways  to  model  threats.  The  literature  review  concluded  with  a  discussion  on  stochas¬ 
tic  estimation  techniques  and  methods  used  for  near  real-time  trajectory  re-planning. 

The  next  two  chapters  will  define  the  optimal  control  methodology. 
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III.  Formulate  the  Optimal  Control  Problem 


The  purpose  of  this  chapter  is  to  answer  research  question  one  by  formulating  the 
optimal  control  problem  and  demonstrating  the  direct  orthogonal  collocation  tech¬ 
nique  using  a  reduced-order  3-state,  2-D  model.  After  formulating  the  components 
of  the  optimal  control  problem  in  Section  3.2,  Section  3.3  provides  a  concise  problem 
formulation  summary.  Results  in  this  chapter  highlight  the  importance  of  providing 
a  good  initial  guess  to  the  gradient-based  NLP  and  the  conclusion  highlights  contri¬ 
butions  that  result  from  formulating  the  optimal  control  problem.  The  usefulness  of 
a  hybrid  optimal  control  methodology,  which  utilizes  the  PSO  to  provide  an  initial 
guess  to  the  NLP,  is  demonstrated  in  the  next  chapter. 

3.1  The  Optimal  Control  Problem 

The  goal  in  an  optimal  control  problem  is  to  find  an  admissible  control, 


U(x(t),f) 


(3.1) 


that  minimizes  an  identified  objective  or  cost  functional 


J (x(t))  =  h(x(t/),  tf)  +  f  g (x(t),  u (t),t)dt 

Jt0 


(3,2) 


where 


h(x( 


(3,3) 


is  a  term  relating  to  the  cost  at  the  final  state  (terminal  cost)  and 


rtf 


g(x(t),u(t),t)dt 


't0 


(3.4) 
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is  a  term  relating  to  the  cost  along  the  path  (running  cost).  The  solution  must  also 
satisfy  dynamic  path  constraints  of  the  form, 

x(£)  =  /(x(t),  u(t),t),  (3.5) 

inequality  and  equality  constraints  along  the  path  (no-fly  zones)  of  the  form, 

c(x(t), u(t),  t)  <  0,  (3.6) 

and  boundary  conditions  of  the  form, 

x(t0)  =  xo  (3.7a) 

x(tj)=xy.  (3.7b) 

The  components  of  the  optimal  control  problem  formulation  will  now  be  described 
for  the  loyal  wingman  problem. 

3.2  Components  of  the  Loyal  Wingman  Problem  Formulation 

The  loyal  wingman  candidate  scenario  involves  a  UAV  that  must  autonomously 
compute  and  fly  to  an  intermediate  target  (waypoint)  on  route  to  a  final  rendezvous, 
while  either  completely  avoiding  threats  or  minimizing  the  risk  of  exposure  when 
threats  are  not  avoidable.  The  purpose  of  this  section  is  to  formulate  the  optimal 
control  problem  in  a  way  that  will  account  for  the  various  threat  and  boundary 
conditions  scenarios. 
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3.2.1  Dynamic  Constraints. 


The  3-state,  2-D,  point-mass  model,  which  does  not  change  throughout  the  mission 
is 


x  =  V cos'ip  (3.8a) 

y  =  V  simp  (3.8b) 

ip  =  u,  (3.8c) 

where  x  and  y  are  the  UAV’s  x-  and  y-  positions,  ip  is  heading,  u  is  the  control,  heading 
rate  in  rad/s  and  V  is  the  velocity,  held  constant.  The  3-state  vector,  {x,y,ip},  will 
be  represented  as  x. 

3.2.2  Boundary  Conditions. 

The  boundary  conditions  include  three  final  condition  scenarios: 

1.  Fixed  final  time,  tfc,  to  fixed  final  point,  x/c, 

2.  Minimize  time  to  fixed  final  point,  xjc,  and 

3.  Minimize  time  to  rendezvous  with  the  manned  lead, 

where  the  subscript  fc  indicates  the  final  condition.  Two  assumptions  are  established 
for  rendezvous  with  the  manned  lead.  First,  the  time-dependent  path  of  the  manned 
lead  is  known  a  priori.  Future  work  may  consider  stochastics  in  the  manned  lead’s 
trajectory  and  the  techniques  discussed  in  Chapter  VI  may  be  extended  for  use  in 
that  scenario.  Second,  it  is  assumed  the  loyal  wingman  needs  to  arrive  within  a  safe 
distance  of  the  manned  lead  and  other  systems  or  algorithms  will  ensure  a  collision- 
free  final  formation  rendezvous.  For  simplicity,  the  optimal  control  problem  is  feasible 
for  a  subset  of  the  general  problem  when  the  states  at  the  final  time,  tf,  of  the  loyal 
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wingman,  subscript  W,  are  equivalent  to  the  states  at  the  final  time  of  the  manned 
lead,  subscript  L,  such  that 

xw(tf)  =  xL(tf).  (3.9) 

In  order  to  satisfy  intermediate  target  constraints,  the  problem  is  broken  into 
multiple  phases,  one  break  for  each  waypoint,  a  common  technique  when  using  direct 
orthogonal  collocation  to  solve  optimal  control  problems.  The  loyal  wingman  begins 
at  the  established  initial  conditions. 

Phase  1  initial  condition  (superscripts  indicate  phase  number): 

xlv(to)  =X0,  (3.10) 


where  in  the  case  of  the  manned  rendezvous  x0  =  Xi(t0)- 
Phase  1  final  condition: 

f )  = 


(3.11) 


Phase  2  initial  condition  is  at  the  intermediate  target  location  (the  superscript  2 
indicates  phase  number): 


X\v(to)  =  xi 
Vw(t  o)  =  2/i- 

Phase  2  final  condition  is  different  for  each  scenario 


(3.12) 


1.  Fixed  final  time  to  fixed  final  point,  =  xjc,  tj  =  tfc 

2.  Minimize  time  to  fixed  final  point,  x^(t^)  =  x/c 

3.  Minimize  time  to  rendezvous  with  the  manned  lead,  x^(f^)  =  x^(fj) 

Finally,  in  a  multiple-phase  problem,  a  linkage  constraint  must  be  established  that 
ensures  the  final  time  and  states  of  phase  1  are  exactly  equal  to  the  starting  time  and 
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states  of  phase  2, 


Hv(^/)  —  xw/(^o) 


A  -  +2 

If  —  L0. 


(3.13a) 

(3.13b) 


3.2.3  Modified  Inside-Outside  Threat  Product  Function. 

Two  threat  scenarios  are  considered:  one  in  which  the  mission  may  be  accom¬ 
plished  without  risk  of  exposure  and  another  in  which  risk  of  exposure  is  unavoidable 
due  to  a  fortified  intermediate  target.  Figures  3.1  and  3.2  indicate  course  layouts, 
where  the  small  circle  indicates  the  starting  position,  the  ‘x’  indicates  the  intermedi¬ 
ate  target,  the  blue  dot  indicates  final  rendezvous,  and  the  shaded  regions  indicate 
the  threats  which  must  be  avoided. 
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Figure  3.1.  Course  Layout  with  Avoidable  Threats 


Chapter  II  highlighted  superquadrics  as  a  method  for  modeling  threats,  which 
are  convenient  for  the  loyal  wingman  application  because  of  the  small  number  of 
parameters  necessary  for  representing  2-D  and  3-D  regions,  and  thus  threat  constraint 
violation  can  be  easily  computed  using  the  2-D  version  of  the  inside-outside  function, 


n h)  = 


X\v  —  Xt 


a ~ 


+ 


Vw  ~  Ur 


Cl ,, 


(3.14) 
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Figure  3.2.  Course  Layout  with  Unavoidable  Threats 

where  h  =  [xw,  Vw,  xt,  Ut,  a®,  o,y,p]]  xw  and  yw  indicate  the  position  of  the  loyal 
wingman  along  its  trajectory;  Xt  and  Ut  indicate  the  center  point  of  the  threat;  ax 
and  ay  are  the  principal  axes  of  the  shape  modeling  the  threat;  and  p  is  a  parameter 
utilized  to  form  the  shape  of  the  superquadric,  where  p  =  2  forms  a  circle  or  ellipse. 
A  given  point,  (xw,Vw),  on  the  loyal  wingman  trajectory  is  outside  the  threat  when 
F  >  1,  therefore  the  inequality  path  constraint  is 

1  —  F(h)  <  0.  (3.15) 

For  scenarios  where  threats  are  unavoidable  because  the  target  lies  inside  a  threat 
region,  as  in  Figure  3.2,  feasible  solutions  require  that  threats  not  be  accounted  for  as 
constraints,  but  rather  as  part  of  the  cost  function.  Two  challenges  arise  associated 
with  modeling  the  inside-outside  function  as  a  running  cost.  First,  Equation  3.14 
may  grow  without  bound  which  provides  a  tendency  for  the  NLP  to  converge  on  a 
trajectory  as  far  away  from  the  threat  region  as  possible,  which  is  not  the  desire.  This 
challenge  can  be  addressed  through  a  boolean  transformation  in  which  all  values  of  F 
greater  than  1  are  set  to  1  and  all  values  of  F  less  than  1  are  set  to  0.  This,  however, 
creates  a  challenge  associated  with  a  gradient-based  numerical  method  for  solving  the 
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optimal  control  problem  [83],  which  requires  continuous  and  differentiable  functions. 
Therefore  a  modified  inside-outside  function  is  developed  via  a  sigmoid  function  which 
emulates  boolean-like  behavior,  but  provides  a  continuous  and  differentiable  function 
between  the  inside  and  outside  of  the  threat.  Represented  through  Equation  3.16 
and  graphically  in  Figure  3.3,  values  inside  the  threat  approach  zero,  values  outside 
the  threat  approach  one  and  there  is  a  smooth  transition  from  zero  to  one  at  the 
border  region  of  the  threat.  The  stiffness  of  the  transition  is  represented  by  s  in 
Equation  3.16.  Its  value  is  a  user-defined  choice  which  balances  threat  exposure 
modeling  error  with  smooth  gradient  information  required  for  solving  optimal  control 
problems  with  gradient-based  NLP  solvers.  As  an  example,  referring  to  Figure  3.3, 
with  s  =  30  in  an  avoidable  threat  layout,  it  is  likely  that  the  optimizer  will  converge 
on  trajectories,  that  when  evaluated  with  Equation  3.14,  values  of  F  will  be  larger 
slightly  larger  than  1. 


<F( h)) 


1 

1  +  eO(h)-l)) 


(3.16) 


Figure  3.3.  Modified  Threat  Exposure  Function 

This  modified  inside-outside  function  may  now  be  used  as  a  running  component  of 
the  cost  formulation  to  minimize  threat  exposure.  Multiple  threats  (n)  are  accounted 
for  by  taking  the  product  of  the  modified  inside-outside  function  for  each  threat  [8]. 
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Using  the  modified  inside-outside  threat  product  function,  a  value  of  1  indicates  a  path 
in  which  all  threats  are  avoided.  In  order  to  formulate  this  cost  as  a  minimization, 
the  modified  inside-outside  threat  product  function  is  subtracted  from  1  in  a  running 
cost,  indicated  in  Equation  3.17, 

[if  ” 

J  MinExposure  /  ))dt.  (3.17) 

Jt0  i=1 

Establishing  the  modified  inside-outside  threat  product  function  as  a  part  of  the  cost 
formulation  eliminates  the  need  to  establish  threats  as  inequality  constraints,  while 
simultaneously  avoiding  threats,  and  minimizing  exposure  when  threats  are  unavoid¬ 
able.  Additionally,  when  using  direct  orthogonal  collocation  to  transcribe  the  optimal 
control  problem  to  a  nonlinear  programming  problem,  inequality  constraints  increase 
the  size  and  complexity  of  the  Jacobian,  generally  increasing  computation  time  and 
decreasing  likelihood  of  generating  a  feasible  solution.  Thus,  it  is  recommended  that 
threats  in  the  loyal  wingman  application  are  always  formulated  as  part  of  the  cost 
function  and  not  as  inequality  constraints. 

Threat  exposure  is  the  primary  component  of  a  fixed-time  cost  formulation,  how¬ 
ever  a  second  component,  minimize  control,  is  added  to  ensure  a  smooth  control 
output: 

J  Control  =  f  u(t)2dt.  (3.18) 

Jt0 

By  adding  the  minimize  exposure  and  minimize  control  components  and  applying 
a  convex  weighting  f3  e  [0, 1],  the  cost  function  formulation  of  a  fixed  time  scenario  is 


dpixedTime  (1  P)  J control  T  \P)  J  MinExposure- 


Next,  a  formulation  is  required  for  scenarios  in  which  minimizing  mission  time 
is  a  priority.  A  minimum  time  formulation  is  developed  with  a  final  condition  cost 
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component,  tf.  Because  minimizing  threat  exposure  and  control  output  are  still 
desirable,  the  minimize  time  scenarios  are  formulated  by  adding  the  final  condition 
component  to  the  JFixedTime  component  and  applying  convex  weighting  a  G  [0,1], 
resulting  in  Equation  3.19. 

J MinTime  T  (1  Ot)  J FixedTime  (3.19) 

These  components  of  the  optimal  control  problem  formulation  represents  the  six 
scenarios  that  will  be  evaluated  in  the  research  herein: 

1.  Fixed  time  to  fixed  location  with  avoidable  threats,  JFixedTime 

2.  Fixed  time  to  fixed  location  with  unavoidable  threats,  JFixedTime 

3.  Minimize  time  to  fixed  location  with  avoidable  threats,  J  MinTime 

4.  Minimize  time  to  fixed  location  with  unavoidable  threats,  JMinTime 

5.  Minimize  time  to  lead  rendezvous  with  avoidable  threats,  JMinTime 

6.  Minimize  time  to  lead  rendezvous  with  unavoidable  threats,  JMinTime 

3.3  Loyal  Wingman  Problem  Formulation  Summary 

The  optimal  control  problem  formulation  for  scenarios  5  and  6  is  to  minimize  the 
cost 

nt  j  ^ 

J  =  atf  +  (  1-a)  /  [(1  -  /3)u(t)2  +  -  (3.20) 

Jto  i=1 
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where  e  and  F  are  defined  in  Equations  3.16  and  Equation  3.14,  respectively,  subject 
to  dynamic  constraints, 


x  —  V  cos  i/j 
y  =  V  sin-?/? 

Tp  =  U, 


boundary  conditions, 


4(b)  =  x0 

X9W(t9f)  =  Xg 

y9w(t9f)  =  y9 

xwttf)  =  xl(*/)? 

and  linkage  constraints, 

xw(df)  =  xfy\t90+1) 

+9  _  +9+1 
Lf  ~  L0  > 


(3.21a) 

(3.21b) 

(3.21c) 


(3.22a) 

(3.22b) 

(3.22c) 

(3.22c!) 


(3.23a) 

(3.23b) 


\/g  =  1,  2,  ...G—  1,  G  representing  the  number  of  phases.  When  the  final  state  location 
is  fixed,  such  as  in  scenarios  1,  2,  3,  and  4,  Equation  3.22c!  must  be  updated, 


xm0/)  =xg- 


(3.24) 


Finally,  if  the  final  time  is  fixed,  as  in  scenarios  1  and  2,  the  cost  function  to  minimize 
is 


J 


[(!  ~  P)u(t)2  +  P(l  ~  nei(^(h)))]dt 

i=  1 


(3.25) 


55 


and  an  additional  final  boundary  condition  constraint  is  enforced, 


*/  = 


(3.26) 


3.4  Initial  Guess  Results 

The  work  herein  used  the  Gauss  Pseudospectral  Optimization  Software  (GPOPS 
II)  [143]  which  is  a  multi-purpose  MAT  LAB®  -based  [79]  transcription  software. 
GPOPS  II  uses  the  roots  of  the  Legendre  polynomial  as  discretization  points  as 
well  as  choosing  the  Radau  Pseudospectral  Method  (RPM)  which  places  collocation 
nodes  at  the  initial  condition  and  in  the  interior,  but  not  at  the  final  condition.  The 
intermediate-target  optimal  control  problem  includes  an  intermediate  target,  which  is 
modeled  using  multiple  phases  such  that  the  final  conditions  for  phase  1  are  the  initial 
conditions  for  phase  2  and  so  on  between  each  phase.  A  challenge  with  gradient-based 
NLPs,  especially  with  nonlinear,  nonconvex  problems  is  that  an  initial  guess  is  nec¬ 
essary  and  the  choice  for  the  initial  guess  impacts  computation  time  as  well  as  which 
locally  optimal  solution  the  NLP  converges  upon.  In  order  to  demonstrate  this  chal¬ 
lenge,  a  single  scenario,  fixed  time  to  fixed  location  through  unavoidable  threats,  is 
formulated  and  solved  by  supplying  various  initial  guesses  to  the  NLP.  Nine  ‘simple’ 
initial  guesses  are  developed  by  a  method  that  can  be  visualized  in  Figure  3.4.  In 
phase  one,  three  paths  are  created  that  connect  the  initial  condition  to  the  interme¬ 
diate  target:  an  arc  to  the  left,  an  arc  to  the  right,  and  a  straight  path,  labeled  1 
through  3.  The  same  is  done  for  phase  two  and  each  segment  labeled  4  through  6. 
This  represents  9  potential  initial  guesses.  Each  of  these  initial  guesses  is  supplied 
to  the  NLP  and  the  resulting  cost  and  computation  time  for  each  initial  guess  is 
documented  in  Table  3.1  and  plotted  in  Figure  3.5.  In  this  particular  scenario,  each 
initial  guess  supplied  to  the  NLP  converged  to  a  different  locally  optimal  solution. 
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Figure  3.4.  Initial  Guess  Segments  Supplied  to  the  NLP 


Table  3.1.  Cost  and  Computation  Time  of  9  Initial  Guesses 


Segment 

Cost 

Computation  Time 

1-4 

16.49 

1.76 

1-5 

14.94 

1.99 

1-6 

11.19 

2.46 

2-4 

15.03 

2.36 

2-5 

14.17 

2.24 

2-6 

15.51 

4.26 

3-4 

20.33 

2.12 

3-5 

15.65 

4.38 

3-6 

23.83 

3.64 
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The  best  cost  from  these  nine  results  was  from  the  initial  guess  segment  1-6,  plotted 
in  black  in  Figure  3.5.  NLP  optimization  tolerance,  number  and  location  of  nodes, 
the  convex  weighting,  as  well  as  the  initial  guess  are  all  factors  that  contribute  to 
which  locally  optimal  solution  the  NLP  ultimately  converges  upon.  Additional  work 
may  be  applied  to  determining  the  right  combination  of  NLP  optimization  tolerance, 
node  placement,  convex  weighting  and  initial  guess.  This  work,  however,  chooses 
to  ameliorate  this  challenge  using  a  hybrid  optimization  methodology  for  achieving 
rapid  and  autonomous  solutions,  which  is  demonstrated  in  the  next  chapter. 


3.5  Optimal  Control  Problem  Formulation  Conclusions 

Research  question  one  was  answered  by  formulating  the  optimal  control  problem 
in  a  static  threat  environment  using  a  3-state,  2-D  model.  The  intermediate  target 
conditions  were  introduced  by  breaking  the  problem  into  multiple  phases  and  link¬ 
ing  initial  and  final  conditions  for  each  phase.  Threat  regions  were  modeled  using 
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superquadrics  and  the  inside-outside  function.  To  ensure  threat  avoidance  and  when 
necessary  minimize  exposure,  a  modified  inside-outside  threat  product  function  was 
developed  and  employed  as  a  component  of  the  running  cost.  Two  cost  functions  were 
established,  one  for  a  fixed  time  scenario  and  the  other  for  a  minimize  time  scenario. 
This  chapter  utilized  a  single  scenario  to  highlight  challenges  with  a  highly  nonlinear, 
nonconvex  optimal  control  problems.  Multiple  initial  guesses  were  provided  to  the 
NLP,  resulting  in  convergence  to  multiple  locally  optimal  solutions.  This  challenge 
will  be  ameliorated  through  the  use  of  a  hybrid  optimization  technique  demonstrated 
in  the  next  chapter. 
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IV.  Hybrid  Optimization  Methodology 


The  purpose  of  this  chapter  is  to  demonstrate  the  hybrid  optimization  methodol¬ 
ogy  in  which  a  heuristic-based  Particle  Swarm  Optimization  (PSO)  algorithm  is  used 
to  supply  an  initial  guess  to  the  gradient-based  nonlinear  programming  (NLP)  solver. 
The  previous  chapter  formulated  the  optimal  control  problem  which  was  then  tran¬ 
scribed  to  an  NLP  problem  using  direct  orthogonal  collocation  (DOC).  This  chapter 
exploits  the  myriad  of  variations  of  the  PSO  method  to  produce  a  PSO  algorithm  tai¬ 
lored  to  the  loyal  wingman  optimal  control  problem  with  the  specific  task  of  rapidly 
providing  an  initial  guess  to  the  NLP.  The  hybrid  technique  as  applied  to  the  loyal 
wingman  optimal  control  problem  is  considered  successful  because  it  produces  a  rapid 
and  feasible,  converged  NLP  solution.  The  chapter  begins  by  describing  a  PSO  al¬ 
gorithm  written  for  the  loyal  wingman  optimal  control  problem,  which  highlights 
methods  for  producing  seeds,  handling  constraints,  and  calculating  costs.  Then,  a 
simulation  is  run  and  a  metric  is  established  to  compare  the  speed  and  accuracy 
of  the  hybrid  technique  to  using  DOC  alone  and  concludes  that  the  loyal  wingman 
optimal  control  problem  may  be  solved  effectively  using  the  hybrid  optimal  control 
methodology. 

4.1  The  Loyal  Wingman  PSO  Algorithm 

A  flowchart  of  the  loyal  wingman  PSO  algorithm  is  shown  in  Figure  4.1,  which 
is  broken  into  two  major  sections:  algorithm  initialization  and  algorithm  iterations. 
There  are  many  factors  in  considering  how  a  PSO  algorithm  should  be  varied  for  a 
particular  application.  Section  4.2  will  show  the  PSO  algorithm  written  for  the  loyal 
wingman  application  as  effective  for  achieving  rapid,  accurate  results. 
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Figure  4.1.  PSO  Algorithm  Flowchart 
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4.1.1  PSO  Seeds  and  Initialization. 


Referring  to  Figure  4.1,  the  first  section  is  the  production  of  seeds  and  initialization 
of  the  components  of  the  basic  two-line  PSO  algorithm  used  in  the  loyal  wingman 
PSO  algorithm, 

v{k  +  1  )j  =  K[v(k)j  +  bLri  <®  ( u(k)j  -  Lj)  +  bGf2  ®  ( u{k)j  -  G )]  (4.1a) 

u(k  +  1)^  =  u(k)j  +  v(k  +  1  )j,  (4.1b) 

Vj  =  1,  2, ...,  M  particles,  Wk  —  1,  2, ...,  R  iterations,  where  ®  represents  element-wise 
multiplication  of  vector  components  and  fj ,  f2  represent  vectors  of  nondetcrministic 
evenly  distributed  parameter  weighting  in  [0, 1].  The  remaining  components,  v,  u,  L, 
G,  bL ,  bG,  and  K,  are  described  next. 

There  are  various  ways  to  define  a  particle  and  a  recommendation  of  this  research 
is  to  explore  other  methods  for  increased  efficiency.  However,  the  work  herein  defines 
a  particle  as  a  vector  of  discrete  control  inputs,  u  =  [ui,u2,  ■■■Un]t ,  where  N  is  the 
number  of  discrete  control  inputs.  Simulation  of  Equation  3.21  with  the  heading  rate 
control  vector,  u,  produces  a  trajectory  for  use  in  evaluating  the  cost  and  constraints. 
A  deterministic  method,  described  in  Appendix  A,  was  developed  for  producing  par¬ 
ticle  seeds  based  on  the  following  criteria:  satisfy  target  and  endpoint  constraints  as 
recommended  by  Hu  and  Ebert  [109],  create  a  broad  range  of  possible  trajectories 
to  aid  in  the  PSO  stochastic  search,  and  produce  in  a  computationally  efficient  man¬ 
ner.  Using  this  deterministic  method,  an  initial  set  of  M  control  vectors  is  produced 
u(0)ixM  =  [«(0)i,u(0)2,  ...m(0)m],  which  when  simulated  produce  an  initial  set  of  M 
discrete  vectors  for  each  of  the  3-states,  {x(0)ixm,  y(0)iXAf,  ^(OjixM} 

The  PSO  algorithm  is  initialized  by  assigning  each  initially  produced  seed  as  its 
own  current  local  best,  Lj  =  u(0)j,  Vj  =  1,2 The  cost  associated  with  each 
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seed,  as  described  later  in  this  chapter,  is  determined  and  the  index  of  the  particle 
with  the  best  cost  is  assigned  j*,  such  that  the  global  best  is  G  =  Lj*. 

The  first  component  of  Equation  4.1  is  the  constriction  factor 

K  =  - - 4==  (4,2) 

(p  -  y/ (f)2  -  A(p 

with  0  =  bL  +  bc,  0  >  4,  suggested  by  Clerc  [100],  where  bi  and  bo  represent  the  local 
(L)  and  global  (G)  weighting  factors.  The  choice  of  these  parameters  effect  the  local 
and  global  nature  of  the  search  as  well  as  convergence  tendencies  [99].  These  values 
were  chosen  for  use  in  the  loyal  wingman  PSO  algorithm  through  an  experiment  as 
described  in  Appendix  C. 

Finally,  the  velocity  component  v(0)j  is  initialized  to  0,  Vj  =  1,  2, ...,  M.  There  are 
many  ways  to  initialize  a  PSO  algorithm  and  additional  work  could  be  done  to  tune 
the  various  parameters  for  a  given  set  of  scenarios.  This  work  does  not  claim  to  have 
found  the  perfect  combination  for  optimal  performance,  however,  the  initialization 
described  in  this  section  is  sufficient  to  accomplish  the  desired  task,  which  is  to 
rapidly  provide  an  initial  guess  to  the  gradient-based  NLP  that  results  in  an  accurate 
and  feasible  solution.  Additional  work  in  determining  the  right  seeds  and  parameters 
may  further  improve  the  quality  and  efficiency  of  the  results. 

4.1.2  PSO  Iterations. 

After  the  PSO  is  initialized,  a  series  of  steps  occur  which  allows  the  ‘flock’  of 
control  vectors,  or  ‘particles’,  to  change  values  through  a  stochastic  search  of  the 
space,  moving  toward  the  currently  assigned  global  best  solution.  Beginning  with  the 
first  iteration,  k  —  1,  a  single  particle,  Uj,  is  updated  through  Equation  4.1.  The 
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updated  particle  is  simulated  using  the  discrete  equations  of  motion 


Xj(i  +  1)  =  V  cos(i(}j(i))At  +  Xj(i) 

(4.3a) 

yj(i  +  1)  =  V  sm('ipj(i))At  +  yj{i) 

(4.3b) 

+  1)  =  u{i)At  + 

(4.3c) 

Vi  =  1,  2, N  —  1,  where  V  is  the  velocity,  held  constant,  and  At  remains  fixed  for 
all  particle  and  iterations. 

The  updated  particle  is  then  evaluated  in  Figure  4.1,  Box  6,  to  check  constraint 
criteria.  Intermediate  and  final  conditions  constraint  criteria  are  relaxed  in  the  PSO 
to  improve  the  PSO  search  capability  for  a  reasonable  computation  time.  The  loss 
of  fidelity  through  this  relaxation  is  overcome  by  speed  and  accuracy  of  the  direct 
orthogonal  collocation  methodology.  The  Euclidean  distance  from  all  points  in  the 
trajectory  to  the  intermediate  target  (x\,yi)  are  calculated 

Dj{i)  =  ^J\xj{i)-x  i]2  +  [yj(i)-yi]2  (4.4) 

Vi  =  1,  2, ...,  N.  If  the  minimum  D(i),  \/D(i),  1, 2, _ ,  iV  is  less  than  the  D *  threshold 

for  meeting  the  intermediate  target  constraint,  then  the  same  formulation  is  used  to 
check  the  final  condition  constraint,  (xfc,yfc).  If  either  constraint  threshold  is  not 
met,  the  PSO  velocity  component  is  set  to  0,  Vj  =  0  [105]  and  the  next  particle  is 
evaluated  Vj  =  1,2,  ...,M.  D *  is  set  to  5  km,  much  higher  than  the  criteria  for  the 
primary  problem  formulation.  This  is  done  to  increase  the  number  of  particles  on 
each  iteration  satisfying  boundary  conditions  and  not  being  ‘thrown  out’,  improving 
computation  time  and  subsequently  relying  on  the  speed  and  accuracy  of  the  DOC 
method  to  regain  lost  fidelity. 

When  a  particle  meets  both  intermediate  and  final  condition  constraints,  the  cost 
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is  then  evaluated  (Figure  4.1,  Box  7).  A  separate  cost  function  is  established  for  each 
boundary  condition  scenario: 

1.  Minimize  time  to  fixed  location,  Equation  4.5 

2.  Fixed  time  to  fixed  location,  Equation  4.6 

3.  Minimize  time  to  lead  rendezvous,  Equation  4.7 

The  minimize  time  to  fixed  location,  scenario  1,  is 

rtf  n 

J MinTimePSO  atf  +  (l  —  a)  /  [(1  -  /3)u(t)2  +  /3(l  -  ENF-(h»)]*-  <4-5> 

i=1 

A  fixed  time,  tfc,  scenario  contains  an  additional  constraint.  Instead  of  checking 
to  ensure  the  constraint  is  met  on  each  iteration  (Box  6),  the  fixed  time  constraint 
is  formulated  as  an  additional  component  in  the  cost  function  [110],  Jm  =  tf  —  tfc, 
such  that, 

p t  f  n 

J FixedTimePSO  &ft J\t T ( 1  cxft )  /  [(l-/3)u(t)2  +  /3(l-]^[ei(Fi(h)))]dt.  (4.6) 

JtO  i=l 

The  convex  weighting  a  ft  £  [0, 1]  is  adjusted  to  put  increased  emphasis  on  the  fixed 
final  time,  such  that  as  the  particle’s  variation  from  the  desired  final  time  increases, 
the  cost  increases  and  when  there  is  no  variation  from  the  desired  final  time,  this 
component  goes  to  zero  and  only  the  running  cost  remains. 

The  scenario,  minimize  time  to  rendezvous  with  the  lead,  JRendezvousPso,  has  an 
additional  constraint  that  the  states,  x,  of  the  loyal  wingman  (subscript  W)  and  lead 
(subscript  L)  match  at  the  final  time,  x^(tj)  =  Xi(t/).  The  cost  function  to  minimize 
time  to  rendezvous  while  minimizing  threat  exposure  is 


J RendezvousPSO  (1  ®-Rend)  J  MinTimePSO  T  O  Rend  D  fpncl , 


(4.7) 
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where  DRenci  is  the  minimum  Euclidean  distance  between  the  loyal  wingman  and  lead 
trajectory  at  each  time-step,  from  the  index  of  the  intermediate  waypoint,  i*,  forward, 


DRend  =  mm{\J [xw(i)  ~  xL(i)]2  +  [yw(i)  ~  Z/z,(«)]2}  (4.8) 

Vi  =  i*,i*  +  1,  ...,N  and  aRend  £  [0, 1]  is  weighted  such  that  as  D Rend  increases,  the 
cost  increases  and  when  DRen(]  goes  to  0,  this  component  goes  to  zero. 

The  cost  of  the  current  particle  is  then  evaluated  according  to  the  appropriate 
cost  for  the  scenario  using  Gaussian  quadrature  and  then  compared  to  the  cost  of  the 
particle’s  current  local  best,  Lj  (Figure  4.1,  Box  8).  If  the  updated  particle’s  cost  is 
lower,  then  the  particle’s  local  best  is  updated,  Lj  =  Uj  (Figure  4.1,  Box  9).  The 
process  for  a  single  iteration,  k,  is  repeated  for  all  particles  j  =  1,2 ,...,M,  which 
is:  update  particle  through  Equation  4.1  (Box  4),  simulate  using  Equation  4.3  (Box 
5),  check  constraint  criteria  (Box  6),  evaluate  cost  (Box  7),  and  update  local  best 
(Boxes  8  and  9).  After  all  particles  have  been  updated  and  evaluated  (Box  10),  if  any 
local  bests  have  been  reassigned,  then  the  costs  of  the  new  local  bests  are  compared 
to  the  current  global  best.  If  a  particle’s  local  best  cost  is  lower,  the  global  best  is 
updated,  G  =  Lj *  (Box  11).  This  completes  one  iteration  of  the  PSO  algorithm.  The 
iterations  continue  \/k  =  1,2,  . ..,7?  until  the  iteration  limit  has  been  reached.  The 
global  best,  G  at  the  final  iteration  is  the  solution  used  as  the  initial  guess  to  supply 
to  the  gradient-based  NLP. 

4.2  Results 

The  loyal  wingman  optimal  control  problem  is  solved  using  three  methods,  direct 
orthogonal  collocation  (DOC)  using  a  gradient-based  NLP,  a  heuristic  particle  swarm 
optimization  (PSO)  and  a  hybrid  technique  in  which  the  output  from  the  PSO  algo- 
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rithm  is  used  as  an  initial  guess  for  DOC.  The  DOC  method  was  run  nine  times  using 
nine  different  initial  guesses,  and  repeated  for  each  scenario.  Using  the  GPOPS  II 
software,  anywhere  from  4  to  9  different  locally  optimal  feasible  solutions  were  pro¬ 
duced  by  the  NLP  solver.  The  costs  of  these  different  solutions  were  compared  and 
the  lowest  cost  output  was  identified  as  the  ‘best’  solution  from  the  DOC  method. 
Computation  time  for  the  DOC  method  is  measured  as  the  total  time  it  takes  to  run 
GPOPS  II  and  the  NLP  solver  for  all  nine  initial  guesses.  An  example  scenario,  fixed 
time  to  fixed  point  through  unavoidable  threats,  was  discussed  in  Chapter  III  and 
the  results  captured  in  Table  3.1  and  Figures  3.4  and  3.5. 

Next,  the  PSO  algorithm  was  run  for  each  scenario  for  a  pre-determined  100 
iterations.  The  cost  and  computation  time  were  captured  along  with  a  graphical  rep¬ 
resentation  of  the  currently  assigned  global  best  solution  when  the  100  max  iteration 
limit  was  achieved. 

Finally,  a  hybrid  technique  is  tested  by  taking  the  output  of  the  previously  men¬ 
tioned  PSO  algorithm  and  supplying  it  as  the  initial  guess  into  the  DOC’s  gradient- 
based  NLP.  The  cost  output  is  captured  and  the  computation  time  is  computed  as 
the  combined  time  to  run  the  PSO,  and  the  DOC  with  the  PSO  output  as  the  initial 
guess.  The  following  six  figures  are  provided  which  overlays  the  trajectory  results  from 
the  DOC  method  alone  (dotted  line),  the  PSO  method  alone  (dashed  line)  and  the 
hybrid  method  (solid  line).  A  table  in  each  figure  identifies  the  cost  and  computation 
time  associated  with  each  method. 

In  order  to  achieve  an  accurate  solution,  it  is  expected  that  when  threat  regions 
are  avoidable,  each  method  should  find  trajectories  that  successfully  avoids  the  threat 
regions.  In  cases  were  threat  regions  are  unavoidable,  it  should  be  expected  that  the 
best  way  to  minimize  time  of  exposure  for  a  constant  velocity  vehicle  and  equally 
weighted  threat  regions  is  to  traverse  the  threats  by  way  of  a  perpendicular  bisector 
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Figure  4.2.  Comparison  of  Fixed  Time  Through  Unavoidable  Threats 


of  the  threat  intersection  points.  Minimum  time  scenarios  should  result  in  trajectories 
which  are  direct,  while  fixed  final  time  scenarios  should  produce  additional  turns  that 
allow  for  idle  time  in  order  to  meet  the  fixed  final  time  constraint. 

Figure  4.2  represents  scenario  one  in  which  the  vehicle  must  overfly  an  intermediate 
target  in  a  layout  in  which  threats  are  not  avoidable,  and  conclude  with  rendezvous 
at  the  fixed  final  time  and  specified  location.  All  three  methods  find  trajectories 
that  minimize  exposure  through  unavoidable  threat  regions  with  a  perpendicular 
bisector  and  the  DOC  method  produces  a  trajectory  with  a  lower  cost  than  does 
the  PSO  alone.  However,  the  hybrid  method  produces  the  lowest  cost  solution  at  a 
computation  time  that  is  faster  than  the  DOC  method  alone. 

Figure  4.3  represents  the  results  of  scenario  two  in  which  the  vehicle  must  overfly 
an  intermediate  target  in  a  layout  in  which  threat  regions  are  unavoidable  and  must 
conclude  at  the  final  rendezvous  in  minimum  time.  The  DOC  solution  produces  a 
lower  cost  solution  than  the  PSO  method  but  takes  nearly  twice  the  computation 
time.  The  hybrid  method  produces  the  exact  same  solution  as  the  DOC  method 
alone,  but  does  so  in  a  more  computationally  efficient  manner.  The  time  it  takes  to 


Figure  4.3.  Comparison  of  Minimize  Time  Through  Unavoidable  Threats 


run  the  hybrid  method  is  nearly  30%  faster  than  the  DOC  method  alone. 

Figure  4.4  represents  the  results  of  scenario  three  in  which  the  vehicle  must  overfly 
an  intermediate  target  prior  to  a  fixed  final  time  and  fixed  final  location  rendezvous. 
This  mission  can  be  accomplished  without  exposure  to  any  threat  regions  and  all 
methods  do  so  successfully.  The  results  are  similar  to  the  scenario  previously  discussed 
where  the  cost  of  the  DOC  solution  is  lower  than  the  cost  of  the  PSO  solution,  but 
to  run  the  DOC  solution  nine  times,  once  again  takes  twice  the  computation  time. 
In  this  case,  the  hybrid  method  outputs  the  same  trajectory  and  cost  associated  with 
the  DOC  method  alone,  but  does  so  12  seconds  faster,  in  approximately  40%  of  the 
time  it  takes  to  run  the  DOC  method  alone. 

Figure  4.5  is  scenario  four,  representing  a  vehicle  that  must  overfly  an  intermedi¬ 
ate  target  prior  to  rendezvous  at  a  final  location.  The  mission  may  be  accomplished 
without  exposure  to  threats,  which  all  methods  obtain  successfully.  What  is  interest¬ 
ing  about  this  scenario  is  the  DOC  method  by  itself,  which  was  run  with  nine  different 
initial  guesses,  returns  a  best-cost  trajectory  that  flies  North  (positive  y-direction)  of 
the  isolated  threat  on  its  way  to  the  final  rendezvous.  This  trajectory  adds  unnec- 


69 


300 

250 

Method 

Cost 

CompTime 

o  Start 

«  Intermediate  Target 
•  Final  Rendezvous 
□Threat 
■■■PSO  Solution 

PSO 

02.797 

14.323 

Best  DOC  Solution 

—  200 

DOC 

02.139 

28.159  _ _ 

—  Hybrid  Solution 

W 

Hybrid 

02.139 

16.867 

X*. 

X 

ctf 

><150 


100 

50 


-&)0  -150  -100 


-50  0  50  100 

East,  x-axis  (km) 


150  200 


250 


Figure  4.4.  Comparison  of  Fixed  Time  Through  Avoidable  Threats 


Figure  4.5.  Comparison  of  Minimize  Time  Through  Avoidable  Threats 
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Figure  4.6.  Comparison  of  Minimize  Time  to  Rendezvous  Through  Unavoidable 
Threats 


essary  cost  in  a  minimum  time  problem  and  its  result  is  an  example  of  one  of  the 
challenges  faced  by  the  gradient-based  search  methods.  As  the  NLP  takes  small  steps 
in  a  direction  to  minimize  the  time,  the  trajectory  is  exposed  to  the  threat  region 
and  bounces  back  above,  effectively  being  caught  in  a  local  minimum.  The  global 
and  stochastic  nature  of  the  PSO  results  in  a  more  direct,  time-saving  route.  When 
this  PSO  result  is  supplied  as  the  initial  guess  into  the  NLP  in  a  hybrid  approach,  a 
lower  cost  solution  is  found  in  less  than  half  the  computation  time  to  run  the  DOC 
method  alone. 

Figures  4.6  and  4.7  are  the  results  for  scenarios  five  and  six  which  both  entail  a 
minimum  time  to  rendezvous  with  the  manned  lead.  The  threat  layout  is  the  same 
as  was  provided  in  previous  figures  with  the  addition  of  the  manned  lead’s  trajectory. 
In  Figure  4.6,  the  trajectory  must  minimize  exposure  through  unavoidable  threat 
regions.  In  this  scenario,  neither  the  PSO  nor  DOC  method  alone  produce  a  desired 
trajectory  passing  through  the  threat  perpendicular  bisector.  The  PSO  solution, 
however,  is  sufficient  to  supply  as  an  initial  guess  to  the  NLP,  demonstrating  the 
desired  behavior  of  the  hybrid  methodology.  Additionally,  each  of  the  DOC  runs  alone 
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Figure  4.7.  Comparison  of  Minimize  Time  to  Rendezvous  Through  Avoidable  Threats 


took  an  average  of  12  seconds,  totaling  almost  110  seconds  when  running  nine  times 
and  the  end  result  is  a  trajectory  that  does  not  find  the  perpendicular  bisector.  When 
using  the  hybrid  method,  the  PSO  took  about  16  seconds,  but  the  NLP  only  took 
4  additional  seconds  to  converge  on  a  lower  cost  solution  that  minimizes  exposure. 
The  hybrid  method  took  less  than  20%  of  the  time  to  run  as  the  DOC  method  and 
produced  a  lower  cost  solution. 

Scenario  six  results  are  shown  in  Figure  4.7,  where  all  methods  return  trajectories 
that  successfully  avoids  threat  regions.  When  running  the  DOC  method  alone  nine 
times,  a  good  solution  is  found  at  the  expense  of  124  seconds  in  computation  time. 
The  hybrid  method  found  the  same  cost  solution  in  21  seconds  -  17%  of  the  time  it 
takes  to  run  the  DOC  method  alone. 

Table  4.1  provides  a  comparison  of  cost  and  computation  time  for  all  three  meth¬ 
ods  for  each  of  the  six  scenarios  and  these  results  are  summarized  in  the  next  section. 
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Table  4.1.  Cost  and  Computation  Time  Comparison  for  All  Scenarios 


Seen  1 

Seen  2 

Seen  3 

Cost 

Time 

Cost 

Time 

Cost 

Time 

PSO 

17.588 

14.981 

2.806 

17.680 

2.797 

14.323 

DOC 

11.194 

25.242 

2.251 

29.159 

2.139 

28.159 

Hybrid 

9.243 

17.410 

2.251 

20.193 

2.139 

16.867 

Seen  4 

Seen  5 

Seen  6 

Cost 

Time 

Cost 

Time 

Cost 

Time 

PSO 

1.560 

17.745 

3.030 

15.950 

2.790 

17.530 

DOC 

1.539 

52.948 

2.590 

107.150 

1.540 

124.780 

Hybrid 

1.421 

20.320 

2.360 

20.400 

1.540 

21.380 

4.3  Hybrid  Methodology  Conclusions  and  Recommendations 

A  PSO  algorithm  was  developed  for  use  in  the  loyal  wingman  application  with  the 
goal  of  rapidly  providing  a  feasible  solution  to  supply  to  the  gradient-based  NLP.  PSO 
seeds  were  developed  deterministically  in  order  to  decrease  PSO  computation  time 
and  methods  of  handling  constraints  recommended  by  other  authors  were  applied. 
A  simulation  was  run  and  a  metric  established  for  comparing  DOC  alone  to  the 
hybrid  technique,  which  used  the  PSO  result  as  its  initial  guess.  In  general,  all  three 
methods  provided  feasible  results,  but  the  hybrid  method  was  superior.  The  DOC 
solution  was  dependent  upon  the  initial  guess  as  was  suggested,  producing  multiple 
locally  optimal  solutions.  One  run  of  the  DOC  method  is  fast,  but  if  the  user  doesn’t 
have  a  good  initial  guess  and  therefore  needs  to  run  the  method  multiple  times  to 
find  a  good  initial  guess,  then  the  computation  time  is  high.  The  PSO  algorithm 
used  for  this  application  was  rapid  and  found  the  best  cost  solution  within  the  space 
that  was  searched.  The  hybrid  method  consistently  resulted  in  a  lower  cost  than  the 
PSO  alone  and  either  was  even  with  or  lower  than  the  results  from  the  DOC  method 
alone.  In  all  cases,  the  hybrid  method  produced  results  faster  than  the  DOC  method 
alone.  Although  the  hybrid  method  consistently  provided  rapid,  accurate  solutions, 
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the  method  remains  imperfect.  There  is  no  way  to  guarantee  a  full  global  search  and 
even  though  the  NLP  tends  to  converge  on  the  locally  optimal  solution  in  the  region 
of  the  initial  guess,  NLP  step-size  may  cause  the  optimizer  to  ‘jump’  out  of  the  region 
of  the  initial  guess. 

There  are  many  factors  to  consider  when  tailoring  the  PSO  to  a  particular  ap¬ 
plication  such  as,  definition  of  particles,  production  of  particle  seeds,  the  use  of  a 
constriction  factor  and  its  value,  local  and  global  parameter  weighting  values,  and 
the  various  ways  of  handling  constraints  as  utilized  in  this  work.  These  factors  were 
tailored  for  the  loyal  wingman  application  and  there  are  potentially  more  factors  to 
consider.  A  recommendation  for  future  work  should  consider  continuing  to  adjust  the 
PSO  to  increase  the  stochastic  search  region  as  well  as  decreasing  the  computation 
time.  A  second  recommendation  for  future  work  is  to  study  the  definition  of  a  particle 
in  this  application  as  the  coefficients  to  a  polynomial,  discretized  at  the  roots  of  an 
orthogonal  basis  set.  There  is  potential  for  synergy  by  using  the  DOC  transcription 
of  the  optimal  control  problem  and  applying  it  for  use  in  the  PSO. 

Overall,  this  chapter  has  demonstrated  a  methodology  for  solving  the  loyal  wing- 
man  optimal  control  problem,  and  between  Chapters  III  and  IV,  answered  research 
question  one  by  formulating  and  solving  the  optimal  control  problem  in  a  static  threat 
environment. 
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V.  3-D  Results 


The  purpose  of  this  chapter  is  to  add  real-world  fidelity  to  the  results  of  the 
previous  chapter  by  answering  question  one  using  a  3-D  model.  First,  the  optimal 
control  problem  that  was  formulated  in  Chapter  III  is  updated  and  summarized  for  a 
3-D  model,  then  there  is  a  discussion  on  the  required  updates  to  the  PSO  algorithm. 
Results  are  provided  along  with  a  discussion  for  the  six  identified  scenarios,  followed 
by  a  conclusion  and  recommendations  for  future  work. 

5.1  Update  Optimal  Control  Problem  Formulation 

A  7-state,  3-DOF  model  suitable  for  the  loyal  wingman  problem  was  provided  in 
Chapter  II  and  will  be  used  to  provide  3-D  results.  The  initial  and  final  conditions 
for  scenarios  identified  in  Chapter  III  remain  unchanged,  however  the  intermediate 
target  constraint  is  updated  to  include  the  altitude  state,  z,  such  that 


xw(tf)  =  xw(to) 

=  Xl 

=  Vw(t  o) 

=  yi 

(5.1) 

zw(tf)  =  zw(t  o) 

=  Zi, 

where  the  superscripts  indicate  phases.  Finally,  the  2-D  inside-outside  function,  Equa¬ 
tion  3.14,  is  updated  for  a  3-D  threat, 

(”)2+(^)’  (5'2) 

where  h  =  [xw,  Uw ,  Zw,  xt,  Ut ,  zt,  a®,  ay,  az,p];  xw ,  Uw,  and  zw  indicate  the  position 
of  the  loyal  wingman  along  its  trajectory;  xt,  Ut ,  and  Zt  indicate  the  center  point  of 
the  threat;  ax,  ay,  and  az  are  the  principal  axes  of  the  shape  modeling  the  threat;  and 
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p  is  a  parameter  utilized  to  form  the  shape  of  the  superquadric,  where  p  =  2  forms 
a  circle  or  ellipse.  The  same  six  scenarios  identified  in  Chapter  III  are  applicable  for 
the  3-D  model 

1.  Fixed  time  to  fixed  location  with  avoidable  threats,  JFixedTime 

2.  Fixed  time  to  fixed  location  with  unavoidable  threats,  JFixedTime 

3.  Minimize  time  to  fixed  location  with  avoidable  threats,  JMinTime 

4.  Minimize  time  to  fixed  location  with  unavoidable  threats,  JMinTime 

5.  Minimize  time  to  lead  rendezvous  with  avoidable  threats,  JMinTime 

6.  Minimize  time  to  lead  rendezvous  with  unavoidable  threats,  JMinTime 


5.2  3D  Loyal  Wingman  Problem  Formulation  Summary 


The  optimal  control  problem  formulation  for  scenarios  5  and  6  is  to  minimize  the 


J  =  atf  +  (  1  —  a)  [(1  -  /3)[ui(t)2  +  u2(t)2}  +  /3(l  -  ]^[ei(.Fj(h)))](it,  (5.3) 


where, 


e(F(h)) 


£ max 

1  +  e-S(F(h)  —  1)  ’ 


(5.4) 


s  is  a  user  defined  stiffness  balancing  threat  border  region  accuracy  with  computa¬ 
tional  efficiency,  and  F  is  identified  in  Equation  5.2.  The  cost  is  subject  to  dynamic 
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constraints, 


boundary  conditions 


and  linkage  constraints, 


x{t )  =  V  cosy (t)  cos y(t) 

(5.5a) 

y(t)  =  G  cos  7(f)  sin  x(t) 

(5.5b) 

z(t)  =  V  sin  7(f) 

(5.5c) 

.  m  Wfl'  cos  n(t)  -  g  cos  7 (t) 

-  y 

(5.5d) 

.  /  _  Nzg  sin  fi(t) 

Geos  7(f) 

(5.5e) 

W(t)  =  Mi(f) 

(5.5f) 

g(t)  =  u2(t), 

(5.5g) 

(5.5h) 

xiv(to)  =  x0 

(5.6a) 

4^)  =  X9 

(5.6b) 

y9w(t9f)  =  ya 

(5.6c) 

Zw(t9f)  =  Z9 

(5.6d) 

xw(t(f)  = 

(5.6e) 

xw(t9f)  =  xw\t9o+1) 

(5.7a) 

+9  _  +9+ 1 

Lf  ~  L 0  1 

(5.7b) 

\/g  —  1,  2, ...,  G  —  1,  where  G  represents  the  number  of  phases  and  x  now  represents 
the  7  states  {x,  y,  z,  7,  y,  Nz,  g}.  When  the  final  state  location  is  fixed,  such  as  in 
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scenarios  1,  2,  3,  and  4,  Equation  5.7a  is  replaced  with 


)  =  xG  (5.8) 

Finally,  if  the  final  time  is  fixed,  as  in  scenarios  1  and  2,  the  cost  function  to  minimize 

is 

pt  f  ^ 

J=  [(l^/3)[Ml(i)2  +  Ul(i)2]+/3(l- JJei(^(h)))]di  (5-9) 

Jto  i=1 

and  an  additional  final  boundary  condition  constraint  is  enforced, 

tGf  =  tG.  (5.10) 

5.3  Update  Loyal  Wingman  PSO  Algorithm 

There  are  two  updates  to  the  loyal  wingman  PSO  algorithm.  One  is  the  use  of  a 
reduced-order  dynamics  model  and  the  other  is  the  method  of  producing  PSO  seeds. 

5.3.1  The  Reduced  Order  PSO  Dynamic  Model. 

A  positive  characteristic  of  a  full-state  model  is  the  accuracy  of  results,  which 
when  provided  to  the  NLP  as  an  initial  guess,  improves  NLP  computation  time.  Use 
of  the  full-state  model  in  the  loyal  wingman  PSO  was  more  challenging,  however, 
because  it  has  a  significant  negative  impact  on  the  speed  and  convergence  of  the 
PSO.  A  recommendation  of  the  work  herein  is  to  continue  to  improve  the  PSO, 
including  choice  of  the  PSO  parameters;  however,  this  work  chose  to  improve  the 
convergence  and  computation  time  of  the  PSO  by  using  a  reduced-order  dynamics 
model.  The  sacrifice  of  accuracy  from  a  reduced-order  PSO  is  deemed  acceptable  due 
to  the  increased  computational  efficiency  of  the  reduced-order  PSO  as  well  as  reliance 
upon  the  accuracy  achieved  by  use  of  the  DOC  method. 
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The  reduced-order  PSO  is  a  3-state,  2-control  model,  taken  from  Equations  5.5a,  5.5b, 


and  5.5c 


x(t )  =  V  cos  7(f)  cos  y(f) 

(5.11a) 

y(t)  =  V  cos  7 (t)  sinx(t) 

(5.11b) 

z(t )  =  V  sin  7 (f), 

(5.11c) 

where  x,  y ,  z  represent  position  coordinates,  and  the  controls  are  heading,  y,  and 
flight  path  angle,  7. 

5.3.2  PSO  Seeds  in  Three  Dimensions. 

Chapter  IV  established  a  deterministic  method  of  producing  seeds  to  initiate  the 
PSO  algorithm.  This  prescriptive  method,  described  in  Appendix  A,  is  useful  for  all 
loyal  wingman  scenarios  which  included  identifying  points  and  spline-fitting  trajecto¬ 
ries  through  those  points.  Identification  of  those  points  must  be  updated  for  use  in 
three  dimensions  and  is  described  in  Appendix  B. 

Loyal  Wingman  PSO  Algorithm  in  3-D. 

Each  particle  on  each  iteration  is  simulated  using  the  discrete  version  of  Equa¬ 
tion  5.11a,  5.11b,  and  5.11c, 


x[i  +  1)  =  V  cos  7(i)  cos  y(i)  At  +  x(i) 

(5.12a) 

y{i  +  1)  =  V  cos7(i)  siny(?')At  +  y(i) 

(5.12b) 

z(i  +  1)  =  V  sin  7  (i)  At  +  z(i). 

(5.12c) 

After  reaching  the  iteration  limit  of  the  loyal  wingman  PSO  algorithm  identified  in 
the  previous  chapter,  the  result  is  a  vector,  length  N,  of  evenly-spaced  state  elements 
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{x,y,z,  7,  x}.  The  NLP  still  requires  an  initial  guess  in  the  form  of  a  vector,  length 
N,  for  two  additional  states,  {Nz,  y]  and  controls,  {u±,  u2}.  This  can  be  done  by  first 
algebraically  rearranging  Equations  5.5d,  5.5e,  and  5.5f  to 


=  tan  1 
Nz(t)  = 


Xit)  cosy (t) 

7 (t)  +  f  cosy (t) 

7  (t)V 


cos  fi(t) 


+  cosy(t) 


(5.13a) 

(5.13b) 


then  discretizing  Equations  5.13a,  and  5.13b,  and  solving  with  a  backward  difference 
formulation, 


—  1)  =  tan  1 
Nz(i  -  1)  = 


xffl-xh-i) 

At 


cosy(i  —  1) 


—  At  1}  +  V  cos7(*  -  1) 

_  7(0  —  7(3  - 1)^ 

cos/r(i  — 1) 


+  cosy(i  —  1) 


(5.14a) 

(5.14b) 


\/i  —  N,  N  —  1, ...,  2.  This  provides  a  vector  length  N  —  1  for  vertical  acceleration,  1V2 
and  roll,  /i.  Equations  5.5f,  and  5.5g  are  also  discretized  and  solved  using  a  backwards 
differencing  formulation, 


ui(i  -  1) 
u2{i  -  1) 


Nz{i)-Nz(i-l) 

At 

n(i)  -  n(i  - 1) 

At 


(5.15a) 

(5.15b) 


Vi  =  N  —  1,  N  —  2, ...,  2  to  produce  a  vector,  length  A  —  2  of  control  inputs,  ii\  and  u2. 
A  discrete,  evenly  spaced  vector  for  each  state  and  control  is  now  available  to  supply 
as  an  initial  guess  to  the  NLP. 
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*  Intermediate  Target 

. Lead  Planned  Path 

—  Lead  Flown  Path 
■■■  Loyal  Wingman  Planned  Path 
—  Loyal  Wingman  Flown  Path 
^Known  Threat  Location 
^Propagated  Threat 

Figure  5.1.  Plot  Legend  for  Figures  5.2,  5.3,  5.4,  and  5.5 


5.4  Results 

The  six  identified  scenarios,  repeated  here  for  convenience,  were  run  to  present 
results  for  each  of  the  three  boundary  conditions  through  both  an  avoidable  and 
unavoidable  threat  layout. 

1.  Fixed  time  to  fixed  location  with  avoidable  threats,  JFixedTime 

2.  Fixed  time  to  fixed  location  with  unavoidable  threats,  JFixedTime 

3.  Minimize  time  to  fixed  location  with  avoidable  threats,  JMinTime 

4.  Minimize  time  to  fixed  location  with  unavoidable  threats,  JMinTime 

5.  Minimize  time  to  lead  rendezvous  with  avoidable  threats,  JMinTime 

6.  Minimize  time  to  lead  rendezvous  with  unavoidable  threats,  JMinTime 

A  legend  is  provided  in  Figure  5.1  and  is  used  to  describe  the  plots  found  in  the 
upcoming  results.  The  starting  and  intermediate  conditions  are  the  red  ‘circle’  and 
black  ‘x’,  respectively.  The  loyal  wingman’s  path  is  indicated  in  blue  and  the  lead’s 
path  is  marked  in  red.  A  dashed  line  indicates  a  planned  path  and  a  solid  line 
indicates  a  flown  path.  Known  static  threats  are  indicated  as  dark  shaded  regions. 
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North(km)  North(km) 


Figure  5.2.  Fixed  Time  to  Fixed  Point 
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Referring  to  Figure  5.2,  additional  quantities  are  included  in  two  sub-axes.  The  first 
axis  indicates  the  loyal  wingman’s  flown  path  exposure,  calculated  using  the  logarithm 
of  Equation  5.21.  Values  less  than  zero  indicate  increased  risk  of  exposure  from  being 
inside  the  threat  region.  The  second  axis  is  a  plot  of  the  altitude  of  the  loyal  wingman. 
All  altitude  plots  indicate  a  starting  position  of  15km,  intermediate  target  of  2km  and 
final  altitude  of  15km.  In  many  cases,  it  may  appear  from  a  2-D  display  that  loyal 
wingman  trajectory  is  going  through  a  threat,  when  in  reality,  the  trajectory  passes 
at  an  altitude  higher  than  the  threat’s  altitude.  Figure  2.1  provides  an  image  of  the 
3-D  contour  of  the  superquadric  threat  keep-out  region. 

Figure  5.2  includes  two  results  from  a  fixed-time,  fixed-location  boundary  con¬ 
dition  scenario.  Subplot  (a)  is  the  trajectory  through  avoidable  threats.  The  loyal 
wingman  trajectory  as  well  as  the  exposure  plot  indicate  that  all  threats  were  suc¬ 
cessfully  avoided.  The  trajectory  also  indicates  a  minimize  control  idle  flight  time  in 
order  to  meet  the  fixed  final  time  requirement.  The  computation  time  for  the  hybrid 
method  was  25.33  seconds,  where  the  PSO  took  18.73  of  those  seconds.  Subplot  (b) 
is  the  same  boundary  condition  scenario,  but  there  is  an  unavoidable  threat  layout 
due  to  a  fortified  target.  In  order  to  minimize  exposure,  the  optimizer  produces  a 
solution  which  is  the  perpendicular  bisector  of  the  overlapping  threats.  A  study  of  the 
altitude  plot  indicates  that  the  optimizer  additionally  took  advantage  of  the  altitude 
to  minimize  exposure  by  climbing  high  and  diving  into  the  threat  exposure  region, 
where  the  intermediate  target  is  located.  The  loyal  wingman  trajectory  then  climbs 
out  of  the  exposure  region  and  flies  a  serpentine  route  to  meet  the  fixed  final  time 
requirement.  The  computation  time  for  the  hybrid  method  was  21.83  seconds,  where 
the  PSO  took  16.58  of  those  seconds. 

Equation  5.2  may  grow  without  bound,  so  the  logarithm  is  used  to  scale  results  for  readability. 
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North(km)  North(km) 


Figure  5.3  includes  two  results  from  the  minimize  time  to  a  fixed  location  scenario. 
Subplot  (a)  provides  the  results  through  an  avoidable  threat  layout.  A  first  glance 


Figure  5.3.  Minimize  Time  to  Fixed  Point 

may  lead  one  to  believe  the  trajectory  will  be  exposed  to  the  threat,  but  a  closer  look 
at  the  exposure  and  altitude  sub-axes  shows  that  the  trajectory  successfully  avoids  all 
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threats  by  flying  at  an  altitude  higher  than  the  threat’s  altitude.  A  second  property 
of  the  resulting  trajectory  is  a  direct  route  to  the  intermediate  and  final  points  in 
order  to  minimize  time  while  avoiding  threats.  The  computation  time  for  the  hybrid 
method  was  42.68  seconds,  where  the  PSO  took  33.68  of  those  seconds.  Subplot  (b) 
is  the  same  minimize  time  to  fixed  point  boundary  condition  as  above,  however  this 
scenario  includes  an  unavoidable  threat  layout,  due  to  a  fortified  target.  Just  as  the 
previous  scenario,  at  first  glance  it  appears  the  trajectory  unnecessarily  flies  through 
an  exposure  region.  However,  a  look  at  the  two  sub-axes  indicate  that  the  trajectory 
flies  to  a  high  altitude,  then  dives  through  the  center  of  the  overlapping  threats  to 
meet  the  intermediate  target  condition.  The  trajectory  then  quickly  climbs  out  of 
the  threat  exposure  region  and  flies  direct  to  the  final  position  in  order  to  minimize 
time.  The  computation  time  for  the  hybrid  method  was  37.66  seconds,  where  the 
PSO  took  32.87  of  those  seconds.  Figure  5.4  provides  the  results  of  minimize  time 
to  rendezvous  with  the  manned  lead  through  an  avoidable  threat  layout.  In  order 
to  highlight  the  time-dependent  final  condition  associated  with  lead  rendezvous,  the 
scenario  is  presented  in  three  static  subplots  (a-c).  The  trajectory  successfully  avoids 
all  threats  by  flying  at  an  altitude  above  one  of  the  threats.  The  loyal  wingman  dives 
to  meet  the  intermediate  target  requirement,  then  climbs  directly  toward  the  manned 
lead’s  path.  The  final  subplot  indicates  a  trajectory  approaching  final  rendezvous  with 
the  manned  lead.  The  computation  time  for  the  hybrid  method  was  34.83  seconds, 
where  the  PSO  took  22.99  of  those  seconds. 

The  results  of  the  final  scenario,  Figure  5.5  are  presented  in  a  fashion  similar  to 
the  previous  scenario,  as  three  static  subplots  (a-c).  The  results  are  similar  to  the 
minimize  time  to  a  fixed  location  scenario,  Figure  5.3,  in  which  the  trajectory  flies 
at  an  altitude  higher  than  the  threat,  then  dives  through  a  perpendicular  bisector 
towards  the  intermediate  waypoint  in  order  to  minimize  exposure.  There  is  then  a 
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(a)  Time  =  5.25  minutes 


(b)  Time  =  11.25  minutes 


(c)  Time  =  24  minutes 


Figure  5.4.  Minimum  Time  to  Rendezvous  with  Lead  Through  Avoidable  Threats 


(a)  Time  =  5.5  minutes 


(b)  Time  =  12.5  minutes 


(c)  Time  =  24  minutes 


Figure  5.5.  Minimum  Time  to  Rendezvous  with  Lead  Through  Unavoidable  Threats 
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swift  exit  from  the  exposure  region,  followed  by  a  direct  route  to  rendezvous  with  the 
manned  lead.  The  computation  time  for  the  hybrid  method  was  28.54  seconds,  where 
the  PSO  took  18.66  of  those  seconds.  Table  5.4  provides  a  quick  look  and  comparison 
of  computation  times  for  each  scenario. 

Table  5.1.  Computation  Time  for  Each  Scenario 


Scenario  PSO  Hybrid 

Fixed  Time  to  Fixed  Point  Avoidable  Threats  18.73  25.33 

Fixed  Time  to  Fixed  Point  Unavoidable  Threats  16.58  21.83 

Minimize  Time  to  Fixed  Point  Avoidable  Threats  33.68  42.68 

Minimize  Time  to  Fixed  Point  Unavoidable  Threats  32.87  37.66 

Minimize  Time  to  Lead  Rendezvous  Avoidable  Threats  22.99  34.83 

Minimize  Time  to  Lead  Rendezvous  Llnavoidablc  Threats  18.66  24.54 


5.5  3-D  Model  Conclusions  and  Recommendations 

In  this  chapter,  the  loyal  wingman  problem  formulation  was  updated  for  a  3-D 
model,  adding  real-world  fidelity  to  research  question  one.  The  loyal  wingman  PSO 
was  modified  in  order  to  produce  seeds  in  three  dimensions  as  well  as  to  compute 
results  using  a  reduced-order  dynamics  model.  For  all  scenarios,  the  NLP  optimizer 
returned  feasible  solutions  meeting  the  optimality  criteria  in  computation  times  rang¬ 
ing  between  20  and  45  seconds.  Overall,  the  computation  time  increased  from  the  2-D 
model  to  the  3-D  model  due  to  the  increase  in  state  variables  for  both  the  PSO  and 
the  NLP.  The  computation  time  of  the  PSO  was  offset  by  the  use  of  the  reduced-order 
model. 

As  discussed  in  Chapter  IV,  additional  research  should  consider  continued  im¬ 
provement  on  the  PSO  algorithm  in  order  to  increase  the  stochastic  search  region 
while  at  the  same  time  decreasing  the  overall  computation  time. 


VI.  Dynamic,  Non-Deterministic  Threats 


The  purpose  of  this  chapter  is  to  answer  research  question  two  by  developing  the 
model  and  solving  the  optimal  control  problem  for  moving,  stochastic  threats.  This 
is  done  by  first  establishing  assumptions  and  limitations  which  drive  the  development 
of  a  dynamics  model  and  a  measurement  update  model.  A  Kalman  filter  is  used  to 
estimate  the  threat  location  and  its  standard  deviation;  information  which  is  used  to 
model  the  threat-keep  out  region.  Finally,  a  simulation  is  run  which  shows  a  trajectory 
that  successfully  avoids  the  moving,  stochastic  threat  throughout  the  mission. 

6.1  Assumptions  and  Limitations 

There  are  numerous  aspects  of  the  problem  which  may  be  considered  moving  and 
stochastic.  It  is  assumed  the  loyal  wingman  itself  will  have  an  accurate  navigation 
system  and  inner-loop  controller  to  maintain  the  computed  trajectory.  It  is  addition¬ 
ally  possible  that  threats  as  well  as  the  intermediate  targets  and  rendezvous  points 
are  moving  and  stochastic.  The  work  herein  will  focus  on  the  moving,  stochastic 
threat  and  suggests  the  same  approach  may  be  applied  to  intermediate  targets  and 
rendezvous  points  in  future  research.  For  simplicity  it  is  assumed  the  discrete  coordi¬ 
nates  of  a  straight  road  are  known  a  priori  and  that  the  dynamic  threat  remains  on 
that  road.  Variations  of  this  assumption  are  discussed  as  topics  for  future  research. 

6.2  Dynamic  Threat  Model 

Referring  to  Figure  6.1,  the  coordinates  along  a  straight  road  (x r{J) ,  y r{j) ,  Zr(j)), 
Vj  =  1,2,  ...,M  are  known  a  priori  and  the  dynamic  threat,  driving  along  the  road, 
remains  on  the  road.  This  allows  for  the  position,  (xt,  Vt,  Zt),  of  the  moving,  stochas¬ 
tic  threat  to  be  estimated  using  a  1-D  distance-only  model.  Prior  to  any  estimation, 
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Figure  6.1.  Coordinates  and  Distance  Along  a  Known  Road 


the  azimuth  of  the  road  may  be  computed  as 


Or  —  Ot  —  tan  1 


■y{M)-y{  1)- 

. x(M )  —  rr(l)- 


(6.1) 


and  the  distance  along  the  road,  dR(j),  is  computed  as 

3 

Or (j )  =  ^2  V7 ( dx(k )2  +  dy(k)2  +  dz(k)2),  (6.2) 

k= 1 

Vj  =  1,  2, M  -  1,  where  rfx(/c)  =  +  1)  -  xR(k),  dy(k)  =  yR(k  +  1)  -  yR(k)% 

and  dz{k)  =  zR{k  +  1)  —  zR(k),  \/k  =  1,  2, ...,  M  —  1. 

Referring  to  Figure  6.2,  the  dynamic,  non-deternrinistic  distance  of  the  threat 
along  the  road  is  modeled  using  a  stationary  first-order  Gauss-Markov  acceleration 
with  zero-mean,  white  Gaussian  noise  strength  E[w(t)w(t  +  Ta)}  =  QS(ra),  where  5 
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Figure  6.2.  Block  Diagram  of  Threat  Dynamics  Model 


is  the  Kronecker  delta  function  and  Ta  is  the  autocorrelation  coefficient.  In  order  to 
achieve  the  process  noise  strength  <x2,  Q  is  set  to  the  desired  value  using  [144], 
Modified  from  Stankovic  [11]  and  Smith  [8]  to  meet  the  assumptions  of  the  loyal 
wingman  problem,  the  discrete  linear  state-space  model  is  shown  in  Equation  6.3, 


xi+\  =  4>x(i)  +  w(i) 


1  At 
0  1 


(qAt— l+e~aAt) 


(1— e~aAt) 


Xi  +  Wi , 


0 


0 


e-aAi 


(6.3) 


where  a  —  4-  and  Q  is  discretized  to  Qd  through  the  vanLoan  method  [130]  and  the 
three  states  represent  distance,  velocity  and  acceleration  of  the  threat.  Because  this 
is  a  linear  model  and  Gaussian  probability  assumptions  can  be  made  on  the  estimate, 
a  Kalman  filter  is  used,  as  in  Brown  [130], 


x(i  +  1)  =  (f>x(i)  (6.4a) 

P(i  +  1)  =  (pP{i)<pT  +  Qd,  (6.4b) 

Vi  =  1,2, ...,  N  time-steps  to  estimate  the  distance  the  threat  has  traveled  dr(i)  and 
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a  standard  deviation,  cr-rfy).  A  limitation  of  the  practical  application  of  this  model 
is  that  velocity  may  grow  beyond  possible  speeds  of  the  threat.  Future  work  should 
test  the  model  to  see  if  an  unbound  velocity  growth  is  an  issue.  If  it  is  an  issue,  then 
a  recommended  variation  of  the  model  is  to  place  a  limit  on  velocity  growth  and, 
due  to  the  nonlinearity  imposed  by  the  limit,  utilize  an  EKF  as  an  estimating  tool. 
All  information  necessary  for  modeling  the  threat  and  evaluating  threat  exposure  at 
any  given  time  is  now  available  and  when  interpoloated  ensures  a  continuous  and 
differentiable  function:  Given  a  discrete  time  t(i)  in  the  loyal  wingman’s  trajectory, 
the  estimated  distance  the  threat  has  traveled  drp{i)  and  standard  deviation  crT(i) 
may  be  evaluated.  Once  the  estimated  distance  traveled  is  evaluated,  it  may  then 
be  used  to  compute  the  distance-parameterized  position  (xT(i),  yT(i),  2T(i)),  of  the 
threat  along  the  road. 

6.3  Modeling  Dynamic  Threat  Avoidance  Region 

Using  super  quadrics,  the  parameters  necessary  to  model  the  threat  region  are 
the  global  coordinates  of  the  threat  center  location,  (xt,  Vti  %t),  the  principal  axes 
of  the  threat,  (ax,ay,az),  where  ax  is  a  local  coordinate  aligned  with  the  road,  ay 
is  perpendicular  to  the  road  and  az  is  vertical,  the  shape-forming  parameter  p  and 
a  rotation  matrix  M3x3(0T)  used  to  rotate  between  the  local  and  global  coordinate 
frames. 

In  the  global  coordinate  frame,  the  estimated  threat  position  (xr(i) ,  yrii) ,  zr(i))  at 
any  time  t(i)  are  determined  consistent  with  the  discussion  in  the  previous  subsection. 

There  are  two  parameters  necessary  for  modeling  the  size  and  shape  of  the  threat 
keep-out  region.  The  size  of  a  deterministic,  static  threat  is  established  using  the 
principal  axes  lengths,  ax ,  ay,  az.  The  size  of  the  stochastic,  moving  threat  is  adjusted 
by  extending  length  along  the  axis  which  the  threat  is  traveling  by  the  standard 
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deviation,  crT(i),  such  that  the  x-axis  length  is 


ax(i)  =  ax(  0)  +  aT(i), 


(6.5) 


Vi  =  1,2,  The  shape  of  a  deterministic  static  threat  is  formed  based  on  an 

initial  shaping  parameter,  p(0)  =  2,  to  form  an  ellipsoid.  The  shaping  parameter 
varies  for  stochastic,  moving  threats  by 

p(0  =  p(  0)  +  VS-  (6-6) 

ax(0) 

Vi  =  1,  2, ...,  N  to  form  a  cylindrical-like  shape  whose  long  axis  is  extended  along  the 
road  by  Equation  6.5.  Additional  details  on  this  choice  to  model  the  threat  keep-out 
region  of  moving,  stochastic  threats  is  discussed  in  Appendix  D. 


6.3.1  Computing  Threat  Exposure. 

The  purpose  of  establishing  a  dynamic  threat  model  is  to  evaluate  discrete  points 
along  the  loyal  wingman  trajectory  to  evaluate  threat  exposure.  A  homogeneous 
transformation  matrix  may  be  calculated  at  discrete  time-steps  t(i),  V?’  =  1,2, 


R 


4x4 v ^ ) 


M3x3  XT3xl(i) 
0la;3  1 


(6.7) 


where 


xr(i) 


xT[i) 

2/t(*)  > 

zT(i) 


(6.8) 
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to  transform  the  loyal  wingman’s  global  coordinates  to  a  threat-centric  local  coordi¬ 
nate  frame  by 

xw(i) 
yw(i) 
zw(i) 

1 

through  which  the  exposure  at  each  discrete  point  in  the  loyal  wingman’s  trajectory 
may  be  evaluated  using  Equation  6.10, 

pW  +  f  yevai{i)\  2  +  / zevai(i) 

V  /  /  V 

The  special  form  of  Equation  6.10  represents  a  ‘growing’  threat  region  along  the 
x-axis. 

6.3.2  Measurement  Update  Model. 

It  is  expected  that  when  the  optimal  path  is  computed,  the  initial  condition  of 
the  non-deterministic,  dynamic  threats  will  be  known  and  the  propagate  portion  of 
the  Kalman  filter  will  be  run.  The  loyal  wingman  will  compute  the  optimal  path  to 
complete  the  mission  while  avoiding  or  minimizing  exposure  to  the  threats  throughout 
the  mission,  recognizing  that  the  threat  keep-out  region  changes  at  each  time  step. 

After  the  original  path  is  planned,  the  loyal  wingman  begins  its  flight  along  the 
computed  path,  however  an  assumed  on-board  sensor  provides  updated  Cartesian 
coordinates  of  the  threat  location  with  a  known  margin  of  error  at  each  discrete 
time-step.  The  distance  along  the  road  associated  with  the  sensor’s  measurement  is 
determined  by  table  lookup  from  the  road  coordinates  known  a  priori.  The  measure¬ 
ment  model  is  then 

z(i)  —  h(x(i))  +  v{i)  (6.11) 


%ev al  (f  )  \ 

ax(i)  ) 
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where  u(i)  is  zero-mean  white  Gaussian  noise  with  strength  E[v(t)v(t)\  =  R8,  where 
R  indicates  the  uncertainty  associated  with  the  sensor.  With  the  values  H  —  h  —  1 
and  i?  =  r  =  cTg,  the  Kalman  filter  measurement  update  equations  are  calculated  in 
combination  with  Equation  6.4  found  in  Brown  [130], 

K{i)  =  p-(i)HT[HP~(i)HT  +  R]~l  (6.12a) 

x(i)  =  x~(i)  +  K(i)  [z(i )  —  Hx~(i)]  (6.12b) 

P[i)  =  [I  -  K{i)H]P~(i).  (6.12c) 

Measurement  updates  may  be  provided  at  an  identified  desired  rate  which,  when 
used  with  the  Kalman  filter,  will  provide  a  better  estimate  of  the  distance  a  threat 
may  have  traveled.  As  measurement  update  rates  increase,  the  estimation  accuracy 
improves. 

6.4  Results 

Two  scenarios  are  simulated  whose  results  demonstrate  the  tools  discussed  in  this 
chapter.  Figures  6.3  and  6.4  show  the  results  of  two  scenarios,  1.)  minimize  time  to 
fixed  location,  and  2.)  minimize  time  to  rendezvous  with  manned  lead.  Both  scenarios 
include  a  moving,  dynamic  threat  that  must  be  avoided.  Plot  properties  are  the  same 
as  discussed  in  Chapter  V.  The  starting,  intermediate  and  final  condition  points  are 
the  red  ‘circle’,  black  ‘x’,  and  blue  ‘star’,  respectively.  The  loyal  wingman’s  path  is 
indicated  in  blue  and  the  lead  path  in  the  appropriate  scenario  is  marked  in  red.  A 
dashed  line  indicates  a  planned  path  and  a  solid  line  indicates  a  flown  path.  Known 
static  threats  are  indicated  as  dark  shaded  regions.  In  subsequent  subplots,  the 
red  shaded  region  indicates  the  propagated  threat  keep-out  region,  which  changes  in 
both  size  and  shape,  according  to  Equations  6.5  and  6.6  at  each  time-step.  Referring 
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to  Figure  5.2,  additional  plot  properties  include  two  sub-axes.  The  first  inset  plot 
indicates  the  loyal  wingman’s  flown  path  exposure,  calculated  using  the  logarithm 
of  Equation  5.21.  Values  less  than  zero  indicate  increased  risk  of  exposure.  The 
second  inset  of  subplots  indicates  the  altitude  of  the  loyal  wingman.  All  altitude  plots 
indicate  a  starting  position  of  15km,  intermediate  waypoint  of  2km  and  final  altitude 
of  15km,  matching  the  specified  constraints.  In  many  cases,  it  may  appear  from  a  2-D 
display  that  the  loyal  wingman  trajectory  is  going  through  a  threat,  when  in  reality, 
the  trajectory  passes  at  an  altitude  higher  than  the  threat  altitude  (as  shown  in  the 
exposure  inset  subplot  remaining  greater  than  zero).  Figure  2.1  provides  an  image 
of  the  3-D  contour  of  the  superquadric  threat  keep-out  region.  The  DOC  method 


(c)  Point  of  Closest  Approach  to  Propagated  (d)  End  of  Mission 

Threat 


Figure  6.3.  Trajectory  Simulation  to  Avoid  Moving,  Stochastic  Threat  with  Fixed 
Point  Rendezvous 


used  40  nodes  in  each  of  two  phases,  for  a  total  of  80  nodes.  Figure  6.3  represents 

Equation  5.2  may  grow  without  bound,  so  the  logarithm  is  used  to  scale  results  for  readability. 
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the  minimize  time  to  fixed  location  scenario.  Subplot  (a)  indicates  the  trajectory 
prior  to  beginning  the  simulation.  The  computed  trajectory  is  based  on  avoiding  the 
threat  throughout  the  mission.  Subplot  (b)  provides  a  view  of  the  threat  situation 
after  the  loyal  wingman  has  flown  approximately  half  of  the  mission.  As  can  be  seen, 
the  propagated  threat  keep-out  region  is  much  larger  than  the  threat  keep-out  region 
indicated  by  the  measurement  update  at  that  time-step.  This  subplot  also  reveals 
that  a  dynamic  re-plan  of  the  trajectory  may  improve  the  current  planned  route, 
which  is  a  motivation  for  the  dynamic  re-plan  research  discussion  in  the  next  chapter. 
Subplot  (c)  displays  the  point  of  closest  approach  to  the  propagated  threat  region.  In 
the  final  subplot  (d),  the  loyal  wingman  has  surpassed  the  threat  and  rendezvoused 
at  the  specified  fixed  final  position  in  minimum  time.  The  computation  time  for  this 
scenario  included  25.6  seconds  for  the  PSO  and  6.9  seconds  for  DOC  method,  for  a 
total  of  32.5  seconds.  Figure  6.4  represents  minimize  time  to  rendezvous  with  the 
lead.  Here,  the  trajectory  of  the  manned  lead  is  known,  and  the  loyal  wingman  must 
overfly  the  target,  avoid  the  threat  and  rendezvous  (match  position  and  heading) 
in  minimum  time.  The  four  subplots  are  similar  to  the  discussion  above  only  the 
final  boundary  condition  is  changed  to  reflect  rendezvous  with  the  manned  lead  in 
minimum  time.  The  computation  time  for  this  scenario  included  21.9  seconds  for  the 
PSO  and  6.1  seconds  for  DOC  method,  for  a  total  of  28  seconds. 

As  a  final  point  of  discussion  for  this  section,  this  work  uses  the  GPOPS  II  Matlab- 
based  software  to  transcribe  the  optimal  control  problem  into  an  NLP,  which  includes 
establishing  a  first  and  second  derivative  matrix.  Patterson  and  Rao  [145]  discuss  the 
creation  of  these  matrices  and  describe  a  method  to  exploit  their  sparseness  for  im¬ 
proved  computation  time.  The  method  is  provided  as  an  option  in  the  GPOPS  II 
software,  which  worked  well  for  a  static  threat  environment.  However,  there  were 
challenges  with  convergence  when  using  this  option  in  a  dynamic  threat  environment. 
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(a)  Trajectory  Based  on  Propagated  Threat 


(b)  Trajectory  Avoiding  Propagated  Threat 


East(km) 


(c)  Point  of  Closest  Approach  to  Propagated  (d)  End  of  Mission 

Threat 


Figure  6.4.  Trajectory  Simulation  to  Avoid  Moving,  Stochastic  Threat  with  Lead  Ren¬ 
dezvous 
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A  review  of  Patterson  and  Ran  [146]  may  provide  insight  on  the  cause  of  this  chal¬ 
lenge.  The  sparsity  exploitation  tool  was  therefore  not  utilized  in  the  loyal  wingman 
optimal  control  problem,  resulting  in  no  noticeable  difference  in  computation  time  in 
a  static  threat  environment,  while  achieving  feasible  solutions  in  the  dynamic  threat 
environment.  Additional  information  on  the  sparsity  of  matrices  can  be  found  in  Pat¬ 
terson  and  Rao  [145]  and  the  method  used  by  the  GPOPS  If  software  for  exploiting 
the  sparsity  is  described  in  Patterson  and  Rao  [146]. 

6.5  Non-Deterministic  Threat  Conclusions  and  Recommendations 

This  chapter  answered  research  question  two  by  developing  a  dynamic  and  mea¬ 
surement  update  model  as  well  as  the  use  of  a  Kalman  filter  to  estimate  the  location  of 
a  moving,  stochastic  threat.  The  keep-out  region  of  the  stochastic  threat  is  modeled 
using  two  parameters.  The  first  is  a  time-dependent  superquadric  shaping  parameter 
to  shape  the  threat  keep-out  region  as  a  cylinder.  The  second  defines  the  size  of  the 
cylindrical  shape  by  extending  its  long  axis  along  the  axis  of  the  road  according  to  the 
estimated  threat  speed.  A  homogeneous  transformation  matrix  is  used  to  transform 
the  loyal  wingman’s  current  location  in  global  coordinates  to  a  local  coordinate  frame 
for  evaluating  the  trajectory  exposure.  A  simulation  was  run  and  results  plotted  which 
show  the  growth  of  the  threat  keep-out  region  as  well  as  the  generated  path  which 
successfully  avoids  the  expanded  threat  keep-out  region  throughout  the  mission.  The 
next  chapter  will  introduce  a  time  to  re-plan  formulation  that  will  re-compute  the 
trajectory  during  mission  flight,  based  on  updated  threat  information. 

The  objective  was  to  model  the  moving,  stochastic  threat  keep-out  region  and 
provide  results  showing  the  hybrid  methodology  can  provide  a  rapid,  feasible  solution. 
In  order  to  do  so  a  number  of  simplifying  assumptions  were  made.  Future  research 
should  consider  variations  to  the  assumptions  of  this  chapter.  First,  the  road  may 
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be  considered  non-straight,  suggesting  a  threat  region  may  be  modeled  using  Smith’s 
SLIMVEE  algorithm  to  create  non-straight  threat  avoidance  corridors  [8].  Second, 
the  linear  1-D  model  may  be  extended  to  two  or  three  dimensions  and  as  long  as  the 
model  is  linear  and  Gaussian  probability  assumptions  are  made,  the  Kalman  filter 
may  be  utilized  to  estimate  location.  In  addition  to  increasing  the  dimension  of  the 
threat  stochastics  model,  the  current  model  allows  for  a  velocity  estimate  that  may 
grow  beyond  possible  speeds  of  the  threat.  Future  work  should  test  the  impact  of  the 
current  velocity  model  and  test  the  use  of  a  nonlinear  limit-imposed  velocity  model. 
Because  dynamic  threat  estimation  is  computed  outside  of  the  NLP,  changes  in  the 
model  have  minor  impacts  on  the  gradient-based  NLP  solver.  Changes  in  the  dynamic 
model  can  be  addressed  by  using  other  available  and  proven  estimating  tools  such  as 
the  extended  Kalman,  unscented  Kalman  and  particle  filter. 
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VII.  Dynamic  Re-Plan 


The  purpose  of  this  chapter  is  to  answer  research  question  three  by  determining  if 
and/or  when  a  dynamic  re-plan  of  the  loyal  wingman  trajectory  is  necessary.  There 
are  two  scenarios  that  will  be  considered.  The  first  scenario  assumes  an  original 
trajectory  has  been  computed,  and  the  mission  is  being  flown,  when  an  intelligence 
source  provides  updated  information  indicating  a  change  in  the  mission  environment 
such  as  a  ‘pop-up’  threat  or  a  change  in  rendezvous  criteria.  A  trajectory  re-planning 
algorithm  must  determine  whether  this  change  in  the  mission  environment  warrants  a 
re-plan  and  generally  will  result  in  an  immediate  re-plan  of  the  loyal  wingman  trajec¬ 
tory  to  meet  the  updated  mission  requirements.  The  second  scenario  for  trajectory 
re-planning  considers  the  presence  of  a  moving,  stochastic  threat.  For  such  a  sce¬ 
nario,  the  work  herein  develops  an  algorithm  to  determine  if  a  re-plan  is  necessary, 
and  subsequently,  a  time  to  re-plan,  tr,  based  on  a  critical  distance  metric,  p,  that 
uses  the  assumed  speed  and  stochastics  of  the  threat,  as  well  as  relative  distance  and 
angle  of  approach  to  the  threat. 

7.1  Mission  Flow  and  Changes  in  Mission  Environment 

The  overall  mission  re-planning  flowchart  is  shown  in  Figure  7.1.  The  left  side 
represents  computation  of  the  loyal  wingman  trajectory,  and  in  the  presence  of  mov¬ 
ing,  stochastic  threats,  recommends  a  time  to  re-plan  the  loyal  wingman  trajectory, 
t,ji.  The  right  side  represents  a  mission  flight,  which  steps  through  the  trajectory  at 
discrete  time-steps,  evaluating  whether  a  mission  re-plan  is  necessary  based  on  ‘pop¬ 
up’  threats  or  mission  changes.  Each  step  in  the  flowchart  is  numbered  to  facilitate 
discussion. 

The  algorithm  is  initialized  (box  2)  with  parameters  passed  from  the  manned  lead 
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or  other  intelligence  sources.  With  parameters  initialized,  bounds  are  established  and 
if  there  is  a  dynamic  threat,  the  distance  traveled  by  the  threat  dp  and  a  standard 
deviation  gt  are  estimated  through  a  Kalman  filter  propagation  (box  3),  as  discussed 
in  Chapter  VI.  This  information  is  passed  into  a  Particle  Swarm  Optimization  (PSO) 
(box  4)  algorithm  [147]  and  the  optimal  control  problem  is  transcribed  into  a  nonlin¬ 
ear  program  (NLP).  The  output  from  the  PSO  is  provided  as  an  initial  guess  into  the 
NLP  solver  and  an  initial  outer-loop  trajectory  is  computed  (box  5).  If  a  dynamic 
threat  is  present  (box  6),  then  an  algorithm  (boxes  9  and  10),  described  in  the  next 
section,  determines  if  and  subsequently  when  (t#,  time  to  re-plan)  the  loyal  wing- 
man’s  trajectory  should  be  re-planned.  The  mission  flow  then  moves  from  trajectory 
computation  on  the  left  side  of  Figure  7.1  to  mission  flight  on  the  right  side. 

At  each  discrete  time-step,  there  is  a  check  to  see  if  there  is  a  change  in  the  mission 
environment.  If  a  new  threat  has  ‘popped-up’  (box  12),  then  the  loyal  wingman’s  dis¬ 
cretized  trajectory  from  the  current  time  forward  is  evaluated,  via  Equations  6.5,  6.6, 
and  6.10,  to  determine  if  the  new  threat  will  cause  exposure.  If  at  any  discrete  point 
in  the  trajectory,  the  value  of  the  inside-outside  function,  F,  is  less  than  one,  indicat¬ 
ing  a  point  in  the  interior  of  the  threat  region,  then  the  loyal  wingman  risks  exposure 
and  the  current  path  should  be  re-planned  (box  13).  If  any  F  is  greater  than  one,  then 
no  trajectory  update  is  needed.  In  addition  to  pop-up  threats,  there  is  a  check  to  see 
if  there  is  a  change  in  the  mission  requirements  (box  16),  such  as  a  new  intermediate 
target  or  new  final  rendezvous,  as  well  as  a  check  to  see  if  the  current  time  t{i)  has 
reached  the  previously  determined  time  to  re-plan  tr  (box  17). 

If  any  of  the  checks  (boxes  13,  16,  17)  indicate  that  a  re-plan  of  the  loyal  wingman 
trajectory  is  necessary,  a  sensor  measurement  update  on  the  current  location  of  the 
threat  is  taken  (box  14),  new  initial  conditions  are  established  (box  15),  and  the  pro¬ 
cess  flows  back  to  the  left  side  of  Figure  7.1  (box  3).  The  entire  process  of  computing 
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and  flying  the  trajectory  is  repeated  until  the  mission  (or  simulation)  is  complete. 


7.2  Moving,  Non-Deterministic  Threats:  Time  to  Re-Plan,  tr 

This  section  describes  Figure  7.1,  boxes  9  and  10  to  determine  if  and/or  when 
a  time  to  re-plan,  tr,  should  occur  in  the  presence  of  a  moving,  non-detcrministic 
threat. 

Determine  IF  a  Re-Plan  is  Necessary. 

Just  because  there  is  a  threat  present  (stationary  or  moving),  does  not  mean  the 
computed  trajectory  is  impacted  by  the  presence  of  the  threat.  As  an  example,  if  the 
loyal  wingman’s  mission  is  to  the  east  and  the  threat  is  in  the  west  (and  in  the  case 
of  a  dynamic  threat,  moving  west),  then  the  presence  of  a  threat  would  not  impact 
the  computed  trajectory.  Therefore,  the  algorithm  must  first  determine  whether  the 
originally  computed  trajectory  is  impacted  by  the  presence  of  the  threat.  This  is 
done  by  evaluating  Equations  6.5,  6.6,  and  6.10  at  all  time-steps  of  the  loyal  wing- 
man’s  trajectory.  If  the  threat  exposure  at  any  discrete  point  is  below  an  established 
threshold  F(i)  <  F*,  evaluated  Vi  =  1,2,...,  AT  then  the  loyal  wingman’s  trajectory 
is  in  the  vicinity  of  the  threat.  As  described  in  Chapter  III,  depending  on  the  user 
choice  of  the  stiffness  value  s,  it  is  likely  the  optimizer  will  converge  on  a  trajectory 
in  which  F  >  1. 

Determining  a  Time  to  Re-Plan,  t r. 

After  determining  that  the  computed  loyal  wingman  trajectory  was  influenced  by 
the  presence  of  a  threat,  a  time  to  re-plan  must  be  computed.  If  a  loyal  wingman 
trajectory  re-plan  occurs  too  early,  then  little  has  changed  in  the  threat  scenario  and 
the  newly  computed  trajectory  will  have  changed  an  insignificant  amount  from  the 
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Figure  7.2.  Angle  of  Approach  (a)  to  Threat  Vector  Normal 

previous  trajectory.  However,  if  the  suggested  time  to  re-plan  is  much  later  it  could 
place  the  loyal  wingman  very  close  to  or  past  the  threat,  in  which  case  there  is  no 
longer  a  need  to  update  the  trajectory.  It  is  therefore  imperative  to  determine  a  time 
to  re-plan  that  considers  the  speed  and  stochastics  of  the  moving  threat  as  well  as 
the  relative  distance  and  angle  of  approach  of  the  loyal  wingman  to  the  threat. 

Referring  to  Figure  7.2,  this  work  develops  the  critical  distance  metric,  p  (defined 
in  Equation  7.4),  suggesting  a  re-plan  should  occur  when  the  loyal  wingman’ s  relative 
distance  to  the  threat,  H(i), 


H{%)  =  ||xw(i)  -  xj'(i)  || 2, 

is  less  than  p.  The  time  to  re-plan,  tr,  can  then  be  determined  by 


,A  t 


(7.1) 


(7.2) 
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where 


imin  =  minji  :  H(i)  <  p{i)} 


(7.3) 


evaluated  Vi  =  1,2,  ...,1V.  The  critical  distance  metric,  p,  is  developed  using  Equa¬ 
tion  6.5,  which  models  the  size  of  the  threat  keep-out  region  at  each  time-step  by 
extending  the  axis  along  the  road  by  the  standard  deviation  at  that  time-step.  Us¬ 
ing  Equation  6.5  alone,  the  time  to  re-plan  would  occur  when  the  loyal  wingman’s 
trajectory  coincides  with  the  threat  keep-out  region  in  a  direct,  ‘head-on’  approach. 
Because  the  loyal  wingman’s  trajectory  was  calculated  based  on  minimizing  expo¬ 
sure,  it  is  unlikely  this  distance  will  be  achieved  in  a  direct  approach.  Therefore,  the 
minimum  length  of  the  principal  axis  of  the  threat  region,  Equation  6.5,  is  extended 
by  a  scaled  standard  deviation,  £(i)cr(i)  leading  to  the  critical  distance  metric,  p(i), 
defined  as: 

p(i)  —  °x(0)  +  0r(i)[l  +  £(*)],  (7.4) 


where  £(i)  represents  a  speed,  distance,  and  angle  of  approach  ratio,  defined  as: 


£(<) 


xT(*)  -xt(0)||2 

H(i) 


cos  a(i)) 


(7.5) 


There  are  three  factors  used  to  develop  £(i)  in  Equation  7.5,  which  are  now  dis¬ 
cussed.  ||xr(i)  —  xx’(O)  || 2  represents  the  distance  the  threat  has  traveled  from  the 
start  of  the  mission  or  the  last  re-plan.  A  fast  threat  will  trend  £  to  a  higher  value, 
whereas  a  slow  threat  will  tend  to  keep  £  small,  which  has  a  direct  relationship  to  the 
increase  or  decrease  of  p  in  Equation  7.4. 

The  second  factor  in  Equation  7.5  is  the  relative  distance  from  the  loyal  wingman 
to  the  estimated  threat  position,  77,  defined  in  Equation  7.1.  Using  an  inverse  rela¬ 
tionship,  if  the  relative  distance  decreases,  then  the  likelihood  of  triggering  a  re-plan 
increases.  If,  on  the  other  hand,  H  increases,  the  likelihood  of  triggering  a  re-plan 
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decreases. 


The  third  factor  in  Equation  7.5  is  the  relative  angle  of  approach,  calculated  as 
the  dot  product  of  the  velocity  vectors  between  the  wingman  and  the  center  of  the 
threat  region, 


cos  a(i) 


IE(i)Tf(i) 

PwiiPwiT 


(7.6) 


Equation  7.6  produces  values  on  the  interval  [—1,1].  A  value  of  either  —1  or  1 
indicates  a  ‘head-on’  approach.  It  does  not,  however,  indicate  whether  the  vehicles  are 
traveling  toward  each  other  (decreasing  H)  or  away  from  each  other  (increasing  H). 
Therefore  Equation  7.5  takes  the  absolute  value  such  that  the  interval  is  [0, 1].  With 
this  formulation,  a  ‘head-on’  approach  scales  £  to  1  and  a  perpendicular  approach 
scales  £  to  0. 

It  is  assumed  the  speed  of  the  loyal  wingman  is  much  greater  than  the  speed  of  the 
threat.  As  a  result,  after  the  loyal  wingman  has  overcome  the  threat,  the  distance, 
H,  increases  at  a  rate  fast  enough  that  Equation  7.2  is  not  satisfied  and  no  re-plan  is 
triggered. 

Appendix  E  provides  a  simulation  that  demonstrates  the  effects  relative  distance 
(Equation  7.1),  the  critical  distance  metric  (Equation  7.4),  relative  speed  and  distance 
ratio  (Equation  7.5),  and  angle  of  approach  (Equation  7.6)  have  on  determining  a  time 
to  re-plan. 


7.3  Results 

A  scenario  was  established  which  included  using  the  hybrid  optimization  tech¬ 
nique  to  solve  the  optimal  control  problem  as  well  as  using  the  techniques  discussed 
throughout  this  work  to  dynamically  re-plan  the  optimal  control  problem  when  a  non- 
deterministic,  moving  threat  ‘pops-up’.  Referring  to  Figure  7.3,  the  plotting  proper¬ 
ties  are  identical  to  the  description  of  Chapter  VI.  Subplot  (a)  provides  an  initially 
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(a)  T=2.25:  Original  Trajectory 


(b)  T=3:  Pop-up  Dynamic  Threat 


(c)  Future,  T=17:  Trajectory  Exposed 


(d)  Future  T=18,  Re-plan  1  at  PCA 
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(e)  T=14:  Time  to  Update 
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(f)  T=17:  Re-plan  2  at  PCA 
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(g)  T=18.5:  Pop-up  Static  Threat,  No  Re-plan  (h)  T=26:  Approach  Lead  Rendezvous 


Figure  7.3.  Demonstration  of  Re-Plan  in  Changing  Mission  Environment 
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computed  trajectory  for  the  loyal  wingman  to  break  from  formation,  accomplish  the 
intermediate  target  and  rendezvous  with  the  manned  lead.  In  subplot  (b),  a  dynamic 
threat  ‘pops-up’  and  although  it  may  appear  the  trajectory  will  not  be  exposed,  once 
the  threat  keep-out  region  has  been  propagated,  a  future  look  in  subplot  (c)  reveals, 
there  is  a  risk  of  exposure.  Therefore,  a  re-plan  must  occur.  A  new  trajectory  is 
calculated  and  subplot  (d)  provides  a  future  look  of  the  newly  computed  trajectory, 
revealing  the  risk  of  exposure  is  low  at  the  point  of  closest  approach  (PCA)  to  the 
propagated  threat.  When  the  re-plan  occurs  at  T  =  3,  the  algorithm  additionally 
computes  a  time  to  re-plan,  tr  =  14.  The  loyal  wingman  flies  until  T  =  tr,  subplot 
(e),  and  a  second  re-plan  occurs.  The  new  trajectory  is  shown  in  subplot  (f)  at  the 
point  of  closest  approach  to  the  propagated  threat.  In  subplot  (g),  a  static  threat 
‘pops-up’,  but  because  it  does  not  pose  a  threat  to  the  loyal  wingman’s  trajectory,  no 
re-plan  occurs.  Finally,  in  subplot  (h),  the  loyal  wingman  is  in  its  final  approach  for 
rendezvous  with  the  manned  lead. 

7.4  Dynamic  Re-Plan  Conclusions 

A  mission  re-planning  flowchart  was  provided  in  Figure  7.1  and  fully  described, 
which  includes  a  formulation  to  determine  if  a  re-plan  is  necessary  as  well  as  an  ap¬ 
propriate  time  to  update  the  loyal  wingman’s  trajectory  when  the  mission  scenario 
includes  moving,  non-detcrministic  threats.  The  time  to  re-plan  is  computed  after 
the  loyal  wingman’s  trajectory  has  been  established,  prior  to  flight,  and  is  deter¬ 
mined  by  the  critical  distance  metric  developed  herein,  which  used  assumed  threat 
speed  and  stochastics  as  well  as  the  loyal  wingman’s  relative  distance  and  angle  of 
approach  to  the  threat.  The  work  is  demonstrated  through  simulation  and  shows  a 
successful  minimum  time  rendezvous  with  the  manned  lead.  The  simulation  addi¬ 
tionally  demonstrates  the  ability  to  adjust  the  original  trajectory  when  a  dynamic 
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threat  pops-up,  as  well  as  update  the  trajectory  according  to  the  critical  distance 
metric.  A  natural  extension  of  this  work  may  consider  re-planning  in  the  midst  of 
multiple  dynamic  threats  as  well  as  additional  intermediate  targets.  Additionally, 
the  time  to  re-plan,  tr,  is  computed  with  no  consideration  to  computation  time.  A 
more  accurate  representation  of  the  computation  time  will  be  known  after  flight  test 
hardware  has  been  determined  and  the  system  flown.  Future  work  should  reduce  the 
time  to  re-plan,  tr,  by  the  estimated  computation  time  of  the  algorithm. 
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VIII.  Conclusions  and  Recommendations 


This  work  sought  to  contribute  to  the  scientific  and  engineering  community  by 
developing  a  methodology  and  set  of  research  questions  that,  when  answered,  would 
aid  in  developing  the  DoD  concept  of  MUM-T,  a  subset  of  which  is  the  loyal  wingman. 
A  specific  definition  and  candidate  scenarios  for  the  loyal  wingman  were  established  to 
appropriately  scope  the  research.  It  was  suggested  that  optimal  control  and  stochastic 
estimation  techniques  could  be  used  to  solve  the  loyal  wingman  problem  as  defined 
herein.  A  methodology  was  proposed  and  a  set  of  research  questions  generated, 
which,  when  answered,  successfully  demonstrated  the  methodology  and  provided  a 
contribution  to  the  body  of  knowledge.  The  research  also  led  to  key  findings  and 
recommendations  highlighted  in  this  chapter. 

8.1  Research  Questions 

1.  The  optimal  control  problem  was  formulated  and  solved  for  a  static 
threat  environment.  The  optimal  control  problem  was  formulated  using  both 
2-D  and  3-D  models.  The  problem  was  divided  into  two  phases  and  solved  us¬ 
ing  direct  orthogonal  collocation.  Multiple  boundary  condition  scenarios  were 
established,  including  fixed  time  to  fixed  location,  minimize  time  to  fixed  lo¬ 
cation,  and  minimize  time  to  rendezvous  with  time-dependent  manned  lead’s 
path.  Threats  were  modeled  as  superquadrics  due  to  the  low  number  of  parame¬ 
ters  necessary  for  generating  various  sizes  and  shapes  which  aids  in  maintaining 
a  low  communication  bandwidth  as  well  as  computational  efficiency.  In  order 
to  account  for  both  avoidable  and  unavoidable  (target  within  threat  region) 
threat  scenarios,  the  threats  were  included  in  the  cost  functional  to  minimize 
time  of  exposure  using  a  modified  inside-outside  product  function  in  addition  to 
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time  and  control.  To  ameliorate  the  issue  of  convergence  to  local  minimums,  a 
hybrid  optimal  control  methodology  was  demonstrated  which  used  a  heuristic- 
based  optimization  technique  to  generate  an  initial  guess.  The  guess  was  then 
supplied  to  the  direct  orthogonal  collocation  (DOC)-transcribed  gradient-based 
nonlinear  programing  (NLP)  solver  to  provide  a  rapid,  accurate  solution. 

2.  The  optimal  control  problem  was  formulated  and  solved  for  a  moving, 
stochastic  threat  environment.  A  1-D  linear  dynamic  and  measurement 
update  model  was  developed  to  mimic  a  threat  traveling  along  a  straight  road 
whose  location  and  standard  deviation  were  estimated  using  a  Kalman  Liter  and 
modeled  using  a  superquadric.  The  size  and  shape  of  the  threat  region  is  formed 
based  on  the  time-dependent  standard  deviation.  The  size  is  varied  by  extending 
the  axis  along  which  the  threat  is  traveling  by  the  standard  deviation  and 
the  shape  is  varied  from  ellipsoidal  to  cylindrical  by  varying  the  superquadric 
shaping  parameter  by  the  standard  deviation  as  well.  Using  these  models,  the 
optimal  control  problem  was  solved  to  ensure  the  loyal  wingman  trajectory 
minimized  threat  exposure  throughout  the  mission  when  the  location  of  the 
threat  at  all  time-steps  is  not  known. 

3.  A  method  was  developed  for  determining  if  and  when  the  trajec¬ 
tory  should  be  re-planned  in  a  changing  mission  or  dynamic  threat 
environment.  A  mission- flow  diagram  was  generated  which  includes  1.)  gen¬ 
eration  of  flight  trajectory  using  methods  resulting  from  research  questions  one 
and  two,  and  2.)  mission  flight  (or  simulation).  After  a  trajectory  is  computed, 
if  a  dynamic  threat  is  present  and  its  presence  impacts  the  computed  trajectory, 
a  time  to  re-plan  is  generated  based  on  the  critical  distance  metric  which  uses 
threat  speed,  direction,  and  stochastics  as  well  as  the  loyal  wingman’s  relative 
distance  and  angle  of  approach  to  the  threat.  After  the  trajectory  is  computed, 
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the  mission  is  flown.  Changes  in  the  mission  environment,  such  as  pop-up 
threats  are  evaluated  for  exposure  and  if  necessary  the  trajectory  is  re-planned 
immediately.  When  the  mission  flight  time  coincides  with  the  time  to  re-plan, 
a  measurement  update  is  taken,  initial  conditions  are  re-established  and  a  new 
trajectory  is  computed.  The  cycle  of  computing  a  trajectory  and  then  flying 
the  trajectory  is  repeated  until  the  mission  is  complete. 

Collectively,  this  work  demonstrated  a  methodology  for  solving  the  loyal  wing- 
man  optimal  control  problem  as  defined  herein  in  a  near  real-time  environment.  This 
optimal  control  methodology  may  also  be  used  as  a  comparison  to  evaluate  the  per¬ 
formance  of  other  methods. 

8.2  Contributions 

In  the  course  of  answering  the  research  questions,  several  specific  research  contri¬ 
butions  have  been  identified. 

1.  A  method  was  developed  to  formulate  and  solve  the  optimal  control  problem 
for  multiple  scenarios,  including  avoidable  and  unavoidable  threats  as  well  as 
multiple  boundary  condition  scenarios,  including  fixed  time  to  fixed  location, 
minimize  time  to  fixed  location,  and  minimize  time  to  rendezvous  with  the 
manned  lead. 

2.  A  method  was  developed  to  model  threats  which  is  useful  for  both  avoidable 
and  unavoidable  threat  scenarios.  Threats  were  modeled  and  formulated  in  the 
cost  function  using  the  modified  inside-outside  product  function  necessary  for 
use  in  the  gradient-based  nonlinear  programming  solver. 

3.  A  method  was  developed  for  estimating  the  location  of  a  moving,  stochastic 
threat.  A  1-D  model  was  established  and  estimation  was  then  used  to  model 
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the  changing  size  and  shape  of  the  threat  keep-out  region  at  all  time-steps. 


4.  A  method  was  developed  for  determining  a  time  to  re-plan  formulation  in  the 
presence  of  a  moving,  stochastic  threat.  The  time  to  re-plan  is  generated  based 
on  the  critical  distance  metric  which  uses  threat  velocity  and  stochastics  as  well 
as  the  loyal  wingman’s  relative  distance  and  angle  of  approach  to  the  threat. 

5.  A  hybrid  optimal  control  methodology  was  developed  and  demonstrated  to 
provide  rapid  and  accurate  optimal  solutions  for  the  loyal  wingman  application 
which  may  be  applied  to  other  applications  as  well. 

8.3  Future  Research  Recommendations 

In  addition  to  identifying  contributions,  the  research  uncovered  a  number  of  key 
findings  relevant  for  future  research. 

8.3.1  Improve  the  Particle  Swarm  Optimization  Algorithm. 

1.  The  research  herein  showed  that  the  hybrid  methodology  could  provide  a  rapid, 
accurate  solution.  Given  that  the  PSO  has  numerous  ‘knobs’  and  many  varia¬ 
tions  suggested  in  literature,  future  research  should  focus  solely  on  developing 
a  PSO  algorithm  that  simultaneously  increases  the  search  region  and  converges 
toward  a  global  region  in  a  more  timely  fashion.  Determining  a  good  initial 
guess  for  a  gradient-based  solver  continues  to  be  an  elusive  challenge  for  the 
optimal  control  community  and  the  PSO  could  be  an  answer  to  that  challenge. 

2.  The  work  herein  defined  the  PSO  particles  as  a  vector  of  discrete  control  inputs. 
Future  work  should  consider  defining  the  PSO  particles  differently.  Many  works 
in  literature  define  the  PSO  particles  as  the  coefficients  to  a  polynomial.  If  this 
were  done  in  the  loyal  wingman  PSO,  a  correct  polynomial  basis  set  that  aligned 
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itself  with  the  GPOPS  II  DOC  transcription  could  create  synergy  between  the 
two  methods  that  may  aid  in  a  more  rapid  solution. 

8.3.2  Extend  the  Moving,  Stochastic  Threat  Model. 

1.  The  work  herein  established  a  simplifying  assumption  that  the  road  was  straight. 
Future  research  should  consider  a  road  that  curves.  The  threat  modeling  in  this 
case  would  remain  1-D,  but  consideration  would  be  given  to  modeling  a  threat 
keep-out  regions  with  a  curved  primary  axis.  This  can  be  done  by  parameteriz¬ 
ing  the  trajectory  and  extending  the  shape  based  on  the  tangent  to  the  curve  or 
through  the  use  of  Smith’s  SLIMVEE  algorithm  [8]  to  establish  no-fly  corridors. 

2.  The  assumption  of  a  single  dimension  may  be  extended  to  2-D  or  3-D.  The 
threat  model  would  simply  extend  additional  superquadric  axes  by  the  standard 
deviation  in  the  appropriate  dimension.  As  long  as  the  dynamic  model  remains 
linear,  the  Kalman  filter  may  be  used.  If  the  dynamic,  stochastic  model  becomes 
nonlinear  or  Gaussian  assumptions  cannot  be  made,  there  are  other  estimation 
filters  which  may  be  utilized. 

8.3.3  Improve  Robustness. 

1.  Specific  threat  and  boundary  condition  scenarios  were  chosen  and  demonstrated 
in  order  to  highlight  the  results  of  the  methodology  for  avoiding  threat,  minimiz¬ 
ing  exposure,  avoiding  propagated  moving  threats,  and  re-planning  to  improve 
optimization  objectives.  Although  the  method  presented  here  is  prescriptive 
and  was  designed  to  work  in  multiple  threat  and  boundary  condition  layouts,  it 
has  not  been  fully  tested.  Future  work  should  test  the  robustness  of  the  method 
to  random  placement  of  threat  and  boundary  condition  scenarios. 

2.  The  work  herein  developed  a  methodology  for  planning  and  dynamically  re- 
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planning  a  trajectory.  Many  assumptions  were  made  on  other  technologies 
which  provide  inputs  to  and  accept  outputs  from  the  loyal  wingman  route¬ 
planning  algorithm.  The  algorithm  used  for  the  research  herein  which  provided  a 
method  for  solving  the  optimal  control  problem  should  be  re-built,  hosted  on  an 
appropriate  system  interface,  and  flown  in  a  system  flight  test.  A  more  accurate 
representation  of  the  computation  time  will  be  known  after  flight  test  hardware 
has  been  determined  and  the  system  flown.  A  flight  test  will  additionally  allow 
for  evaluation  of  the  performance  of  the  method  chosen  for  this  work  and  allow 
for  identification  of  areas  for  additional  research. 

8.4  Summary 

The  DoD  is  moving  forward  to  produce  appropriate  autonomy  in  systems  that 
work  effectively  and  synergistically  with  manned  counterparts.  This  research  has 
demonstrated  a  methodology  that  provides  optimal,  rapid  solutions  to  the  loyal  wing- 
man  optimal  control  problem  as  defined  herein  with  an  assumed  level  of  autonomy. 
Regardless  of  which  method  is  ultimately  chosen,  the  methodology  herein  is  an  op¬ 
timal  solution  that  can  aid  in  evaluating  other  dynamic  route-planning  algorithms. 
The  body  of  work  presented  here  is  a  foundation  upon  which  to  continue  integrating 
the  manned-unmanned  team. 
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Appendix  A.  2D  Deterministic  Method  to  Produce  Seeds 


For  an  optimal  control  problem,  a  particle  of  a  PSO  algorithm  is  a  vector  of 
discrete-time  control  inputs  that  produce  an  optimal  mission  path  with  respect  to  the 
identified  cost  function.  The  loyal  wingman  PSO  was  seeded  using  a  deterministic 
algorithm  with  a  goal  of  meeting  the  following  criteria:  meet  target  and  endpoint 
constraints,  represent  a  broad  range  of  possible  trajectories  to  aid  the  PSO’s  stochastic 
search,  and  achievable  in  a  computationally  efficient  manner. 

The  first  step  is  to  produce  a  set  of  possible  two-dimensional  curves  using  a  spline 
interpolation.  Initial,  intermediate  target,  and  final  conditions,  denoted  pi,  p2,  and 
p3,  are  criteria  established  in  the  problem  scenarios  and  provide  a  means  to  ensure 
the  spline  fit  data  meets  constraint  criteria.  For  purposes  of  this  discussion,  consider 
phase  1  between  pi  and  p2,  and  phase  2  as  between  p2  and  p3.  In  order  to  allow  for 
a  broad  range  of  trajectories,  intermediate  points  are  chosen  in  each  of  the  phases. 
The  Euclidean  distance  connecting  pi  to  p2  and  p2  and  p3  is  computed  as  d3  and  d2 , 
respectively.  Beginning  with  phase  1,  a  perpendicular  bisector  L i,  the  length  of  d\ 
is  constructed  at  the  midpoint  between  p\  and  p2.  Points  are  now  chosen  on  this 
perpendicular  bisector  to  add  curvature  to  the  splines.  If  for  example,  a  points  are 
chosen  on  L\,  then  there  are  a  curves  which  can  now  connect  p\  to  p2.  Using  the 
same  method,  sample  points  are  also  chosen  in  phase  2,  such  that  if  f3  points  are 
chosen  there  are  f3  curves  which  can  now  connect  p2  to  p3.  When  the  two  phases 
are  combined  there  are  now  M  =  a  *  f3  possible  splines  that  can  be  constructed  that 
meet  initial,  intermediate  target,  and  endpoint  location  criteria.  The  constructed 
points  can  be  seen  visually  in  Figure  A.l.  After  the  points  through  which  a  spline 
interpolation  is  desired  have  been  identified,  curves  are  parameterized  using  Eugene 
Lee’s  centripetal  scheme,  [148]  through  a  convenient  MATLAB  function  esevn,  which 
is  the  accumulated  square  root  of  the  chord  length.  The  MATLAB  function  spline  can 
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Phase  2  varying 
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Figure  A.l.  Data  Points  Used  to  Fit  Spline 

then  be  used  which  allows  for  specification  of  derivative  conditions  at  the  boundaries. 
In  the  case  of  the  loyal  wingman,  the  derivative  boundary  condition  is  heading,  -0, 
computed  as  yh  Any  initial  heading  can  be  supplied  in  the  construction  of  the  spline 
by  specifying  the  initial  condition  boundary  as  y=sin('0),  x  =  cos (0).  Specification 
of  the  boundary  condition  is  very  important  in  the  loyal  wingman  PSO  because  the 
algorithm  will  simulate  the  controls  with  an  identified  initial  heading.  If  the  spline 
fit  does  not  return  an  initial  ^  equivalent  to  the  initial  condition  specified  by  the 
loyal  wingman  problem,  then  the  simulation  will  not  produce  a  trajectory  that  meets 
constraints.  Choosing  a  =  (3  =  7  and  0 1  =  |,  0/c  =  0,  the  returned  49  spline 
fit  curves  can  be  seen  in  Figure  A. 2.  No  consideration  is  given  to  avoiding  threats 
when  producing  these  trajectories.  This  allows  for  the  deterministic  production  of 
seeds  as  a  general  algorithm  that  can  be  used  as  the  loyal  wingman  optimal  control 
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problem  scenarios  are  varied.  At  this  point,  a  specified  set  of  points  (in  this  case  100 


X  vs  Y 
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X  Position 

Figure  A. 2.  49  Trajectories  Representing  49  Control  Vector  ‘Particles’ 

evenly  spaced  points  for  each  spline)  have  been  identified  through  a  parameterized 
spline  interpolation,  but  the  goal  is  to  achieve  a  set  of  controls  as  particles  in  the 
PSO  algorithm.  This  is  achieved  by  using  the  data  output  from  the  spline  function 
to  derive  heading  and  heading  rate.  Before  this  can  be  done,  the  data  must  be  re¬ 
parameterized  to  time.  The  arc  length  between  each  data  point  is  computed  using 
Equation  A.l 

arc(i)  =  ]jl+  "[’])  (*(i  +  1)  -  *(«))  (A.l) 

then  divided  by  the  loyal  wingman’s  constant  velocity  to  achieve  a  time  parameter¬ 
ization  of  the  data  points.  The  data  points  are  then  re-discretized  using  a  spline 
interpolation  to  evenly  spaced  time  units.  There  are  now  49  trajectories  composed  of 
evenly  spaced  data  points  parameterized  to  time.  From  this,  heading  can  be  derived 
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at  each  time  step  i  using 


=  tan  1 


y(j  + !)  -  y(i) 

x{i  +  1)  —  x(i)  ’ 


(A.2) 


and  finally  the  heading  rate  control  can  be  derived  at  each  time  step  through  the 
discrete  equation  of  motion 


u(i) 


i]){i  +  1)  -  ^{i) 

A t 


(A.3) 


The  desired  particles,  heading  rate  control  vectors,  have  now  been  achieved.  However, 
the  time  step  for  each  trajectory  is  slightly  different  and  in  order  to  fit  into  the  loyal 
wingman’s  PSO  architecture,  the  time  steps  must  be  equivalent  for  all  trajectories. 
A  common  vector  length,  chosen  based  on  the  longest  trajectory  and  even  time  step 
is  chosen  to  which  all  control  vectors  are  re-fit.  In  cases  where  the  common  vector 
length  is  beyond  what  is  needed  to  simulate  the  trajectory,  the  remaining  elements 
are  filled  with  0.  This  allows  the  length  of  the  trajectory  to  grow  and  shrink  to  the 
tune  of  the  PSO  algorithm’s  iterations. 
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Appendix  B.  3D  Deterministic  Method  to  Produce  Seeds 


Appendix  A  described  a  deterministic  method  for  producing  seeds  for  a  two- 
dimensional  model.  The  purpose  of  this  index  is  to  update  a  portion  of  Appendix  A 
for  the  production  of  seeds  in  a  three-dimensional  model. 

The  first  step  is  to  produce  a  set  of  possible  three-dimensional  trajectories  with 
Euclidean  state  space  (x,  y ,  z)  data  alone,  using  a  spline  interpolation.  Refering  to 
Figure  A.l,  initial,  intermediate  target,  and  final  conditions,  denoted  pi,  P2,  and  p3, 
respectively,  are  criteria  established  in  the  problem  scenarios  and  provide  a  means  to 
ensure  the  spline  fit  data  meets  constraint  criteria.  For  purposes  of  this  discussion, 
consider  phase  1  between  pi  and  p-2,  and  phase  2  as  between  p2  and  p3.  In  order  to 
allow  for  a  broad  range  of  trajectories,  intermediate  points  are  chosen  in  each  of  the 
phases.  Considering  only  phase  one,  the  Euclidean  distance  connecting  pi  to  p2  is 


computed  as, 


di  =  \J  ( dx 2  +  dy 2  +  dz 2), 


(B.l) 


where  dx  =  X\  —  x0l  dy  =  yi  —  y0,  and  dz  =  —  z0- 

The  elevation  and  azimuth  in  global  coordinates  are  computed  as 


(B.2a) 

(B.2b) 


In  order  to  construct  a  set  of  candidate  points,  consider  a  set  of  equally  dis¬ 
tributed  distances,  rlxL  e  [0,  eh],  representing  the  radii  of  various  circles,  and  ipiXM, 
evenly  distributed  G  [0,  27t],  representing  an  angular  position  on  each  of  the  circles. 
Algorithm  1  may  then  be  used  to  construct  a  set  of  candidate  phase  one  intermediate 
points  through  which  a  set  of  candidate  splines  may  be  fit. 

Figure  B.l  indicates  the  production  of  intermediate  spline  points.  The  top  subplot 
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Points  Generated  in  3D  for  Spline  Interpolation 
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Algorithm  1  Algorithm  to  Produce  Candidate  Points  Through  Which  to  Fit  a  Spline 
1:  for  i  =  1  :  L  do 
2:  for  j  =  1  :  M  do 

3:  for  k  =  1  :  L  *  M  do 

4:  x(k)  =  xq  +  r(i)  ■  cos(— 7  +  /3(j))  ■  cos(— ^  +  |) 

5:  y(k)  =  y0  +  r(i)  ■  cos(-7  +  /3(j))  ■  sin(-'0  +  f ) 

6:  z(/c)  =  z0  +  r(i)  ■  8m(-j  +  jd(J)) 

7:  end  for 

8:  end  for 

9:  end  for 

is  a  ‘top-view’  orientation,  intended  to  highlight  the  radial  distance  for  generating 
points.  The  bottom  subplot  is  orientated  in  a  view  through  a  soda  straw  that  connects 
Pi  and  p2  to  indicate  angular  positions  in  concentric  circles. 

Finally,  the  derivative  boundary  conditions  are  used  in  the  spline  fit. 

cos (70  •  cos (il; 0)) 

V  sin (70  •  cos(i>Q ))  (B.3) 

sin(7o) 

Once  the  points  are  created,  the  process  continues  as  was  indicated  in  Appendix  A. 
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Appendix  C.  Experiment  to  Choose  PSO  Parameters 


The  PSO  algorithm,  Equation  C.l,  contains  various  parameters  such  as  social 
weightings,  bi,b2  [97]  and  a  constriction  factor,  K  [100], 

v(k  +  1)  =  K[v(k)  +  bin  <S>  (u(k)  -  L)  +  b2r2  <8>  ( u(k )  -  G)]  (C.la) 

u(k  +  1)  =  u(k)  +  v(k  +  1).  (C.lb) 

Chapter  IV  highlighted  convergence  tendencies  associated  with  the  choice  of  param¬ 
eters.  An  experiment  was  performed  to  determine  the  most  appropriate  value  for  the 
constriction  factor  as  well  as  the  choice  of  b\  and  b2  for  use  in  the  loyal  wingman 
optimal  control  problem. 

The  value  0  was  varied  from  4.1  to  7  along  with  the  associated  constriction  factor 
as  determined  from  Equation  4.2.  The  PSO  algorithm  was  run  10  times  for  each 
value  of  0  and  the  average  cost,  calculated  using  Gaussian  quadrature,  and  time  to 
converge  were  recorded.  The  results  can  be  seen  in  Table  C.l.  The  best  cost  in  this 
experiment  was  with  a  0  of  4.2  and  constriction  factor  of  .6417.  0  is  a  sum  of  the  two 
Table  C.l.  Average  Cost  of  Various  Constriction  Factors 


0 

K 

Cost 

Iterations 

7.0 

.2087 

2.4670 

102 

6.2 

.2534 

2.4587 

124 

5.8 

.2845 

2.4371 

140 

5.0 

.3820 

2.4130 

180 

4.2 

.6417 

2.3194 

452 

4.1 

.7298 

2.3402 

454 

social  weighting  factors  so  a  separate  experiment  was  run,  where  0  and  K  are  fixed 
at  4.2  and  .6417,  respectively,  but  the  individual  social  components,  b\  and  b2 ,  are 
varied.  Each  scenario  is  run  10  times  and  the  average  cost  is  calculated  and  captured 
in  Table  C.2.  The  best  cost  was  found  when  b\  was  at  1  or  less.  This  result  is  because 
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a  high  local  weighting  causes  the  search  to  stay  in  the  local  area,  never  moving  its 
search  toward  the  global  best.  When  global  best,  b2  is  weighted  high,  the  particles 
search  outside  their  local  area  in  a  movement  toward  the  globally  best  particle. 


Table  C.2.  Average  Cost  of  Various  Social  Weighting  Factors 


h 

b2 

Cost 

4.0 

0.2 

2.4114 

3.5 

0.7 

2.3468 

3.0 

1.2 

2.3420 

2.5 

1.7 

2.3127 

1.5 

2.7 

2.2817 

1.0 

3.2 

2.2688 

0.2 

4.0 

2.2751 

0.1 

4.1 

2.2715 
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Appendix  D.  Dynamic,  Stochastic  Threat  Size  and  Shape 


Chapter  VI  established  two  parameters  to  form  the  size  and  shape  of  a  moving, 
stochastic  threat.  The  purpose  of  this  appendix  is  to  discuss  the  rationale  behind  the 
choice  of  Equations  6.5  and  6.6. 

Referring  to  Figure  D.l,  the  first  circle  represents  a  known  threat  location  with 
known  shaping  parameter,  p(i)  =  p(0)  =  2  to  form  an  ellipse  whose  semi-minor  x-axis, 
a*(0),  is  aligned  with  the  road.  When  the  threat  is  moving  and  is  non-deterministic, 
this  work  estimates  the  location  with  a  standard  deviation  using  a  Kalman  filter. 
Referring  to  the  bottom  set  of  circles  in  Figure  D.l,  when  estimating  the  threat 
location,  it  is  not  known  if  the  threat  is  at  the  filter  estimated  position  (center  circle) 
or  if  it  is  a  standard  deviation  away  (two  end  circles)  or  somewhere  in  between.  This 
work  proposes  avoidance  of  the  entire  region  of  a  threat’s  possible  location  at  each 
time-step.  Two  parameters  are  used  to  model  the  stochastic  threat  avoidance  region. 
The  Erst  is  the  size  parameter 


a>(i)  —  Ox(0)  +  cr(i)  (D.l) 

Vi  =  1  :  N  timesteps,  which  extends  the  axis  along  the  road  by  the  standard  deviation 
cr(i)  at  each  timestep.  The  shaping  parameter  is  chosen  to  keep  constant  the  axis 
distance  perpendicular  to  the  road,  ay,  and  vertical,  az,  while  extending  the  axis 
along  the  road  as  a  cylindrical  shape  according  to 

p(*)  —  P(°)  +  “TjW  (D-2) 

a((J) 

Vi  =  1  :  N  timesteps.  When  the  standard  deviation  is  small  in  relation  to  the 
original  length  of  the  axis  along  the  road,  ax(0),  the  shaping  parameter  p  =  2  and 
the  superquadric  takes  on  the  appearance  of  an  ellipse.  However,  as  the  standard 
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P(0  =  P(0)  =  2 


Figure  D.l.  Known  and  Unknown  Threat  Locations 

deviation  increases,  the  shaping  parameter  forms  a  cylindrical  shape  whose  curvature 
flattens  along  the  distance-increasing  axis  of  the  road  as  well  as  the  bases  of  the 
cylinder  on  each  end.  Results  in  Chapter  VI  show  the  changing  size  and  shape  of  the 
superquadric  as  the  threat  stochastics  are  propagated. 
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Appendix  E.  Distance  to  Re-Plan  Formulation 


Chapter  VII  provided  set  of  equations  to  determine  a  time  to  replan,  tr,  based 
on  assumed  threat  speed  and  stochastics  as  well  as  the  relative  distance  and  angle  of 
approach  of  the  loyal  wingman  to  the  threat.  This  appendix  demonstrates  through 
simulation,  the  effect  Equations  7.1,  7.4,  7.5,  and  7.6  have  on  determining  a  time  to 
re-plan.  Referring  to  the  Erst  subplot  of  Figure  E.l  ,  a  grid  layout  shows  the  loyal 
wingman  and  threat  time-dependent  trajectories. 

The  location  of  the  loyal  wingman  and  the  threat  are  indicated  at  time-steps  T1 
through  T4.  T1  is  where  the  simulation  begins.  Between  T1  and  T2,  the  center 
subplot  indicates  an  angle  of  approach  that  is  oblique,  close  to  head-on  and  does  not 
change.  The  value  of  £  increases  slowly  due  to  the  moving  threat,  while  the  bottom 
subplot  indicates  the  distance  between  the  two  bodies  is  slowly  increasing. 

Between  time-steps  T2  and  T3,  the  loyal  wingman  maneuvers  and  at  T3,  the  angle 
of  approach  to  the  threat  vector  is  perpendicular.  This  can  be  seen  in  the  center 
subplot  where  the  value  of  the  angle  of  approach  and  £  drop  to  zero.  This  angle  of 
approach  is  only  temporary,  though  and  between  T3  and  T4,  the  angle  of  approach 
steadies.  The  bottom  subplot  indicates  a  slow  decrease  in  the  distance  between  the 
two  bodies,  causing  £  and  p  to  increase.  This  continues  until  Equation  7.2  is  satisfied 
at  T4  and  a  re-plan  occurs. 

Referring  to  Figure  E.2,  a  new  trajectory  has  been  produced  at  T4  and  £  is  reset 
to  0  because  the  distance  the  threat  has  traveled  is  reset  to  0.  Between  T4  and  T5, 
the  distance  between  the  two  bodies  decreases,  causing  £  and  p  to  increase.  This 
time,  however,  Equation  7.2  is  not  satisfied  and  at  T5,  the  loyal  wingman  and  threat 
begin  to  move  away  from  each  other.  Because  the  loyal  wingman  is  traveling  at  a 
speed  that  is  much  greater  than  the  threat,  H  and  p  diverge  and  Equation  7.2  is 
never  satisfied,  i.e.,  there  is  no  re-plan. 
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Figure  E.l.  Trajectory  1:  Effect  of  p,  and  H 
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